I. INTRODUCTION


Despite all the recent advances in the areas of communications and data coding, 
there are still a large number of applications where the achievable data rate is not 
sufficient to the task. There also exists an even larger number of tasks for which an 
improvement in data compression would enable us to do the job better or more
efficiently. Two prime examples of the types of data for which better coding is desirable
are digital speech and image data. Both of these data types require an extremely high 
data rate for real time transmission. Both also display a wealth of internal structure 
that can be utilized for compression by a coding system. Finally both of these types 
of signals can be transmitted with a certain amount of distortion and still provide 
the required information. For example, it may be sufficient to maintain intelligibility 
for speech data, and it may be sufficient for an image to display enough detail for an 
analyst to recognize certain key features rather than a faithful bit by bit
reproduction. This is in contrast to many other types of digital data for which our principal
interest is to add error correcting capability until the probability of a single bit error 
is vanishingly small. All these factors combine to make improved compression tech- 
niques for speech and image data a worthy goal and thus an active area of research 
in digital signal processing. 

In signals for which we can tolerate some distortion, there must be some method 
for measuring the distortion relative to the original signal. These fall into the two basic 
categories of subjective distortion measures and objective distortion measures. The 
subjective measures are a result of human impressions of the comparison between 
original and distorted versions, while an objective measure has some closed form
mathematical expression by which we can compare competing systems. For our study
we desire a data type that has a simple objective measure that corresponds well to the
results of subjective measures. Fortunately, for image data there exists a distortion 
measure, mean square error, which is both easy to calculate and corresponds fairly well 
to subjective distortion results. Thus in this thesis, we concentrate on the compression 
of image data. 

There are many schemes for compressing image data, but few have been
successful in producing good image quality at low data rates. Generally, images are
coded with each pixel assigned a grey level from 0 to 255. This corresponds to a data 
rate of 8 bits/pixel. Several examples showing how three common coding techniques 
perform at low data rates are shown in the following figures. First, we examine the 
technique of scalar quantization, in which we map the 256 grey levels into a smaller 
number, which can then be transmitted using a smaller number of bits. Figure 1.1 
shows the original at 8 bits/pixel, Figure 1.2 shows scalar quantization at a data rate 
of 4 bits/pixel, Figure 1.3 shows a data rate of 2 bits/pixel, and Figure 1.4 shows a
data rate of 1 bit/pixel. Clearly this technique produces poor results below about 
4 bits/pixel. Second, we examine the technique of delta modulation, in which we 
encode the difference between the current and previous pixels using a raster scan. 
Figure 1.5 shows the original image, Figure 1.6 shows delta modulation at a data rate 
of 4 bits/pixel, Figure 1.7 shows a data rate of 2 bits/pixel, and Figure 1.8 shows
a data rate of 1 bit/pixel. While delta modulation is an improvement over scalar 
quantization, it tends to perform poorly below about 2 bits/pixel. Next, we examine 
a transform technique, the two dimensional fast Fourier transform (2-D FFT).
Figure 1.9 shows the original image, Figure 1.10 shows the 2-D FFT at a data rate of
4 bits/pixel, Figure 1.11 shows a data rate of 2 bits/pixel, and Figure 1.12 shows a
data rate of 1 bit/pixel. This technique offers fairly good reproduction down to 2
bits/pixel.
Figure 1.7 Delta Modulation at 2 bits/pixel

Figure 1.8 Delta Modulation at 1 bit/pixel

Figure 1.11 2-D FFT at 2 bits/pixel

Figure 1.12 2-D FFT at 1 bit/pixel

Now we compare the previous techniques to a more powerful method, vector 
quantization. In vector quantization, we encode an entire block of data using a single 
codeword. This codeword is produced by comparing the block to be encoded with a 
codebook of example blocks and choosing the example which is closest in some sense. 
A comparison of vector quantization and the previous three techniques is made in the 
following figures at a data rate of 1 bit/pixel. Figure 1.13 shows scalar quantization 
at 1 bit/pixel, Figure 1.14 shows delta modulation at 1 bit/pixel, Figure 1.15 shows 
the 2-D FFT at 1 bit/pixel, and Figure 1.16 shows vector quantization at 1 bit/pixel. 
Clearly the technique of vector quantization is superior at low data rates. There 
exist other coding techniques [Ref. 1], such as linear predictive coding and the discrete
cosine transform, which have been successful in image coding but were not considered
in these examples for the sake of brevity.


A. THESIS OBJECTIVE 


Vector quantization has not been commonly used because of the large compu- 
tational cost involved in generating the codebook and finding the closest codeword 
for each block to be transmitted. Recently, a revived interest in research in neural 
networks has shown some promise for an efficient implementation of vector quanti- 
zation. In this task we benefit from the neural network’s ability to quickly perform 
categorizations (which accelerates the codebook generation), and also from the par- 
allel processing capability (which speeds the process of comparing an input block 
to the codebook). This thesis concentrates on addressing the difficulties in
implementing vector quantization using neural networks. Full search, tree search, and
multistage VQ schemes are studied for this purpose. Results of the application of
these techniques to image data are presented.
B. THESIS OUTLINE

In the second chapter we describe the basic theory of vector quantization,
introduce the existing VQ algorithms, and present some simple examples of how a
codebook is generated. In the third chapter we introduce the basic concepts of neural
networks, discuss the types of neural network learning, and present the algorithms
which can be applied to the problem of vector quantization. In the fourth chapter we
identify the shortcomings of existing neural network vector quantizers, and apply the
tree search, multistage, and classification vector quantization schemes in an effort to
improve performance.


II. VECTOR QUANTIZATION


A. INTRODUCTION 

One of the results of Shannon's rate-distortion theory [Ref. 3] is that better 
results can always be obtained if vectors are used in coding rather than scalars. 
This result applies even if some technique has been applied to the input data to 
remove all correlation. Although delta modulation and transform methods provide 
substantial improvement over scalar quantization, they all use scalar coding and are 
thus suboptimal. As we saw in the examples presented previously, vector quantization 
provides a dramatic improvement in reproduction quality for low data rates. In this 
chapter we examine the technique of vector quantization as it applies to images and
review existing methods of implementation.


B. DETAILS OF THE METHOD 


Vector Quantization (VQ) [Ref. 2] consists of two sets of mappings: an encoder,
$\gamma(x)$, which assigns a channel codeword, $u = (u_1, u_2, \ldots, u_p)$, to each input vector,
$x = (x_0, x_1, \ldots, x_{k-1})$, from a set of possible channel symbols called a codebook, and
a decoder, $\beta(u)$, which assigns a code vector, $y$, to each channel codeword. Note that
each input vector is just a vector version of a block from the subject image with each
pixel value corresponding to an element in the vector. The channel symbols consist
of all possible binary $p$-tuples, where the size of the input vector and the channel
codeword are in general not the same. The number of possible channel codewords is
$2^p$, and thus the bit rate of the vector quantizer is $p$ bits/block or $r = p/k$ bits/pixel.
It is interesting to note that by properly selecting $p$ and $k$, we can generate any
fractional bit rate that we desire. This is in contrast to scalar quantization where
we are limited to integer bit rates. Figure 2.1 shows the basic structure of a vector
quantizer system.
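As a quick check on the rate formula, the following sketch computes r = p/k for a few combinations of codebook size and block size. The function name `bit_rate` and the particular sizes are illustrative choices of ours, and the codebook size is assumed to be a power of two so that p is an integer.

```python
import math

def bit_rate(codebook_size, block_pixels):
    """Return r = p / k in bits/pixel, where p = log2(codebook size)
    and k is the number of pixels in one block."""
    p = math.log2(codebook_size)
    return p / block_pixels

# 16 codewords for a 2 x 2 block: 4 bits for 4 pixels.
rate_small = bit_rate(16, 2 * 2)
# 512 codewords for a 3 x 3 block: 9 bits for 9 pixels.
rate_large = bit_rate(512, 3 * 3)
# 8 codewords for a 3 x 2 block: 3 bits for 6 pixels, a fractional rate.
rate_fractional = bit_rate(8, 3 * 2)
```

The last case shows a rate below 1 bit/pixel, which a scalar quantizer operating on individual pixels cannot provide.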





Figure 2.1: Vector Quantization


The basic goal of the vector quantizer design is to discover which specific set of 
encoder and decoder mappings will give the best reproduction of the image in some 
sense. This depends upon the cost function or distortion which we use to measure 
the quality of the output image. We wish to find a distortion or distance measure 
between our input image, $X$, and the output image, $\hat{X}$, that is easy to compute and
provides good correspondence with subjective image quality. The measure chosen for
this work is the mean square error (MSE), which is defined as

$$d(X, \hat{X}) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( x_{ij} - \hat{x}_{ij} \right)^2 \qquad (2.1)$$

where $N$ is the number of pixels along one side of the image, and $x_{ij}$ and $\hat{x}_{ij}$ are corresponding pixels of the input and output images.
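A minimal sketch of the mean square error between two square images is given below. The images are assumed to be stored as lists of rows of grey levels; the function name `mean_square_error` and the sample values are ours.

```python
def mean_square_error(original, reproduction):
    """MSE between two N x N images given as lists of rows."""
    n = len(original)
    total = 0.0
    for row_x, row_y in zip(original, reproduction):
        for x, y in zip(row_x, row_y):
            total += (x - y) ** 2
    return total / (n * n)

original = [[10, 20], [30, 40]]
reproduction = [[12, 20], [30, 36]]
# Squared errors are 4, 0, 0 and 16, averaged over the 4 pixels.
mse = mean_square_error(original, reproduction)
```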
We will now examine the conditions under which the vector quantizer will be
optimal. First we define the set of all possible code vectors, $y$, as
$C = \{\, y : y = \beta(u) \text{ for some codeword } u \,\}$. This set of code vectors together with the corresponding codewords is called
the codebook. For the vector quantizer to be optimal it must display two properties
[Ref. 5].

First, the encoder must select the code vector which is closest to the input
vector according to the distortion measure. In our case this is the mean square error.
This can be stated as

$$d(x, \beta[\gamma(x)]) = \min_{u} d[x, \beta(u)] = \min_{y \in C} d(x, y) \qquad (2.2)$$


Thus the encoder can be thought of as a device which partitions the input vector
space into sections, each of which surrounds a code vector. Any input which falls into a
section will be assigned the codeword corresponding to the code vector contained in that
section. Equivalently, all input vectors occurring in a section are
grouped together and transmitted as a single representative code vector.

Second, for a given encoder $\gamma$, the decoder must assign as the code vector the
generalized centroid of all the vectors which are encoded into that codeword. In our case
the centroid can be expressed

$$y = \beta(u) = \mathrm{cent}(u) = \frac{1}{i(u)} \sum_{x_i : \gamma(x_i) = u} x_i \qquad (2.3)$$

where $i(u)$ is the number of input vectors that are mapped to $u$. This is the selection
of the code vector which will minimize the distortion, $E[\, d(x, y) \mid \gamma(x) = u \,]$, for a
particular encoder.

If we carefully examine the previous two properties, we can see that the first
gives us a method to optimize an encoder for a given decoder, and the second gives us a
method to optimize a decoder for a given encoder. This suggests an iterative technique
of applying these two properties successively until convergence is obtained or some
desired distortion level is reached. This is in fact the basis for the generalization of
Lloyd's optimal scalar quantizer [Ref. 4], which was produced by Linde, Buzo, and
Gray [Ref. 6] and is referred to as the LBG algorithm. The algorithm consists of the
following steps:
• Step 1 Choose an initial decoder.

• Step 2 Encode the image using the given decoder (optimize the encoder as in
the first property). If the distortion is small enough, terminate the algorithm.

• Step 3 For each codeword u replace the corresponding code vector with the
centroid of all input vectors that mapped to u in the encoder produced by step
2 (optimize the decoder as in the second property). Then repeat step 2.
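The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in this work: the helper names (`sq_dist`, `centroid`, `nearest`, `lbg`) are ours, the stopping rule (stop when the average distortion improves by no more than `tol`) is one possible reading of "small enough", and an empty cell simply keeps its old code vector.

```python
def sq_dist(a, b):
    """Unnormalized squared error between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def nearest(x, codebook):
    """First property: index of the code vector closest to x."""
    return min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))

def lbg(data, initial_codebook, tol=1e-6, max_iter=100):
    codebook = [list(cv) for cv in initial_codebook]   # Step 1
    previous = float("inf")
    distortion = previous
    for _ in range(max_iter):
        # Step 2: encode every training vector with the current decoder.
        labels = [nearest(x, codebook) for x in data]
        distortion = sum(sq_dist(x, codebook[j])
                         for x, j in zip(data, labels)) / len(data)
        if previous - distortion <= tol:   # assumed stopping rule
            break
        previous = distortion
        # Step 3: replace each code vector by the centroid of its cell.
        for j in range(len(codebook)):
            cell = [x for x, label in zip(data, labels) if label == j]
            if cell:
                codebook[j] = centroid(cell)
    return codebook, distortion

# Two well separated clusters; the initial decoder is one vector from each.
data = [[0, 0], [0, 2], [2, 0], [2, 2],
        [10, 10], [10, 12], [12, 10], [12, 12]]
codebook, distortion = lbg(data, [data[0], data[4]])
```

For this toy data set the code vectors settle on the two cluster centroids after a single update.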


The last detail to be addressed is the selection of the initial decoder. Clearly in
an iterative technique such as this, a good initial selection can make a large difference
in the number of iterations required for convergence to the final result. Several
techniques have been employed (See for example [Ref. 7]). First, we can just select the
appropriate number of input vectors from the image and use them as the code
vectors in the initial codebook. Second, we can apply a scalar quantizer to each element
of the vector and generate the number of values needed to form the code vectors.
Lastly, we can use a technique known as splitting. In this technique we start with a
codebook of size 1, which is just the centroid of the entire data set. Then we split the
code vector by adding and subtracting a small vector from the original code vector
and optimize this new codebook of size two. Then we split the two resulting code
vectors into a codebook of size 4 and optimize this codebook as well. We continue
this pattern of splitting and optimization until the desired codebook size is reached.
Of the initialization techniques described above, the splitting technique is most often
used because it initializes each step with a good initial guess which limits the number
of iterations required.
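A minimal sketch of the splitting technique follows. Several details are assumptions made for illustration: the small vector is taken to be (eps, eps, ..., eps), each optimization is a fixed number of encode and centroid passes rather than a convergence test, the target size is a power of two, and the names `refine` and `split_codebook` are ours. The toy data lie along a diagonal so that the assumed perturbation direction separates them.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def refine(data, codebook, passes=20):
    """Optimize a codebook with a fixed number of LBG passes (a simplification)."""
    for _ in range(passes):
        cells = [[] for _ in codebook]
        for x in data:
            j = min(range(len(codebook)), key=lambda c: sq_dist(x, codebook[c]))
            cells[j].append(x)
        # Empty cells keep their previous code vector.
        codebook = [centroid(cell) if cell else cv
                    for cell, cv in zip(cells, codebook)]
    return codebook

def split_codebook(data, size, eps=0.01):
    """Grow a codebook from size 1 to `size` (a power of two) by splitting."""
    codebook = [centroid(data)]
    while len(codebook) < size:
        # Split every code vector into cv + eps and cv - eps, then optimize.
        codebook = [[c + sign * eps for c in cv]
                    for cv in codebook for sign in (1, -1)]
        codebook = refine(data, codebook)
    return codebook

# Four small clusters along the diagonal.
data = [[0, 0], [1, 1], [4, 4], [5, 5],
        [10, 10], [11, 11], [14, 14], [15, 15]]
codebook = split_codebook(data, 4)
```

With this data the size-two stage separates the lower and upper halves, and the size-four stage places one code vector at the centroid of each cluster. A perturbation direction that does not separate the data can leave the procedure in a poorer local optimum, so the choice of the small vector matters in practice.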

Now we present a two dimensional example to give an intuitive feel for how the 
splitting algorithm progresses. The data set for the example is presented in Figure 2.2. 
This data set has been chosen to have a simple structure in order to eliminate the 
need for many iterations at each splitting. In this example we attempt to generate a 
vector quantizer of size four. Step 1 is to form the codebook of size one by calculating 
the centroid of the entire data set (See Figure 2.3). Step 2 is to split the size one 
codebook into a size two codebook by adding and subtracting a small vector (See 
Figure 2.4). Step 3 is finding the vector subspaces which define the decision areas for 
the new codebook (See Figure 2.5). Step 4 is to calculate the centroid of each of the 
new vector subspaces and make these centroids the new code vectors (See Figure 2.6). 
Step 5 is to split the newly generated size two codebook into a size four codebook by
adding and subtracting a small vector from each code vector (See Figure 2.7). Step
6 is to calculate new vector subspaces for each of the code vectors (See Figure 2.8).
Step 7 is to calculate the centroid of each new subspace and assign them as code
vectors (See Figure 2.9). Finally, step 8 is to find the vector subspaces corresponding 
to the decision regions for our final codebook (See Figure 2.10) and the algorithm is 
complete. 

For this example it is easy to see where the code vectors for a vector quantizer
of size four should be placed: one at the centroid of each of the four clusters. This is the
result produced by the LBG algorithm. However, we must keep in mind that this
example was carefully contrived to eliminate the large number of iterations required
for optimization at each splitting. A problem from a real data set, even if the
dimension and size are the same as our example, would be much more computationally
expensive.
Figure 2.3 Step 1, Centroid of Data Set
Figure 2.5 Step 3, New Subspaces
Figure 2.6 Step 4, New Centroids
Figure 2.7 Step 5, Second Point Splitting
Figure 2.8 Step 6, New Subspaces
III. NEURAL NETWORKS


A. INTRODUCTION 

Artificial Neural Networks [Ref. 8] have recently been the subject of intense 
research because of a desire to develop machines which can achieve human-like per- 
formance in such areas as speech and image recognition. After a lengthy period of 
inactivity in this area, the recent development of new algorithms, advances in ana- 
log VLSI techniques, and a new emphasis on parallel computing have contributed to 
major advances in this field. 

Like their biological counterparts, neural networks rely on a large collection of 
simple but highly connected processing elements. This enables the neural network to 
avoid the sequential instruction processing characteristic of the von Neumann com- 
puter, and instead process many possible results in parallel. This property makes 
a neural network an attractive option to investigate in many recognition problems. 
Neural networks are also designed to adaptively update the interconnection weights 
between processing elements in an effort to improve their performance. This adaptive 
updating is termed “learning.” This property allows a neural network to continue to 
function well despite changes in the statistics of the input data. 

A neural network is a good tool in pattern recognition because of its ability to
quickly categorize an input pattern into a previously learned category. However, there
also exist different algorithms which are equally proficient at taking a data set and
forming the occurring patterns into categories without supervision, that is, without
external definition of the categories to be used by the neural network. Thus with
some modifications, a neural network can be made to do a task which is very similar
to vector quantization. If we can find the proper way to update the interconnection
weights and the proper function for the processing elements, we should be able to find
a configuration that is capable of duplicating the results of the LBG algorithm which
we saw in the previous chapter. In the following sections we will briefly discuss the
difference between unsupervised and supervised learning, and how neural networks
can be applied to the problem of vector quantization.


B. NEURAL NETWORK LEARNING 

Each processing element of the neural network is connected to many inputs
$x = (x_0, x_1, \ldots, x_{N-1})$ (See Figure 3.1). These inputs could originate directly from
the input to the network, or some or all could arrive from the output of another
processing element. Each input to the processing element has an associated weight
$w_i$, which describes the strength of the connection between the associated input node
and this processing element. Each processing element has an activation level which
is a function of the inputs and weights. One of the most common activation formulas
is

$$y = f\left( \sum_{i=0}^{N-1} w_i x_i - \theta \right) \qquad (3.1)$$

where $\theta$ is some threshold. This is just a weighted sum which is thresholded and
subjected to a function $f$, which is usually nonlinear.
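The activation of a single processing element can be sketched as follows. The hard limiter used for the nonlinearity is only one possible choice and is our assumption, as are the names `activation` and `hard_limiter` and the sample values.

```python
def activation(inputs, weights, theta, f):
    """Weighted sum of the inputs, minus the threshold, passed through f."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return f(weighted_sum - theta)

def hard_limiter(a):
    """One possible nonlinearity: output 1 when the argument is non-negative."""
    return 1.0 if a >= 0 else 0.0

x = [1.0, 0.0, 1.0]
w = [0.5, -0.3, 0.25]
# The weighted sum is 0.75, so the element fires for theta = 0.5 but not for theta = 1.0.
fires = activation(x, w, theta=0.5, f=hard_limiter)
silent = activation(x, w, theta=1.0, f=hard_limiter)
```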

Typical neural networks are made up of many of these processing elements 
which are arranged and interconnected in some pattern. This pattern, the activation 
formula discussed above, and the scheme for adaptively updating the weights for each 
processing element are the items which characterize each type of neural network. 

A final property which characterizes a neural network is the manner in which
it is trained. There are two main categories, namely supervised and unsupervised.
1. SUPERVISED LEARNING
In supervised learning, the neural net is provided a set of desired output 
values for each set of input values presented to the network. These desired output 
values are used in order to update the interconnection weights. A good example 
of a neural network which uses supervised learning is the backpropagation network 
[Ref. 9], which is arranged as in Figure 3.2. The activation function for a typical
implementation of the backpropagation algorithm is

$$f(\alpha) = \frac{1}{1 + e^{-\alpha}} \qquad (3.2)$$

which is known as a sigmoid logistic function. The training of the network proceeds
as follows:


• Step 1 Initialize the interconnection weights to small random values.

• Step 2 Present a set of input values and corresponding desired output values
to the network.

• Step 3 Apply the activation formula for each processing element until the
output values have been calculated.

• Step 4 Update the interconnection weights starting with the output layer and
moving downwards using the formula

$$w_{ij}(t+1) = w_{ij}(t) + \eta\, \delta_j x_i \qquad (3.3)$$

where $w_{ij}$ is the interconnection weight between node $i$ of the previous layer and node
$j$ of the current layer, $x_i$ is the activation level of node $i$, and $\eta$ is the learning rate.
The backpropagated error is

$$\delta_j = \begin{cases} y_j (1 - y_j)(d_j - y_j) & \text{for an output node} \\ x_j (1 - x_j) \sum_k \delta_k w_{jk} & \text{for an intermediate node} \end{cases} \qquad (3.4)$$

where $y_j$ is the activation level of node $j$ on the current layer, $d_j$ is the
desired output for node $j$, and the sum over $k$ runs over the nodes of the next layer toward the output.


Steps 2-4 are repeated until the network weights have converged or the 
error between the output and desired signals is sufficiently low. This method works 
well for a case such as speech recognition where we can collect a large quantity of 
sample data with the correct classification appended to allow training of the network. 
However, a network of this sort is of little use for a problem like vector quantization
in which the neural network must form the desired categories without any external 
guidance. 
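To make the supervised training loop concrete, the sketch below trains a small network with one intermediate layer on the logical OR function. It is an illustration under several assumptions of ours: the thresholds are folded into the weights as a bias on a constant input of 1, the layer sizes, learning rate, seed, and number of passes are arbitrary, and the function names are hypothetical.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w_hidden, w_out):
    """Return (hidden, output) activations; the last weight of each row is a bias."""
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    output = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_out]
    return hidden, output

def train_step(x, d, w_hidden, w_out, eta):
    hidden, output = forward(x, w_hidden, w_out)
    # Error for output nodes: y_j (1 - y_j)(d_j - y_j).
    delta_out = [y * (1 - y) * (dj - y) for y, dj in zip(output, d)]
    # Error for intermediate nodes: x_j (1 - x_j) sum_k delta_k w_jk,
    # computed before the output weights are changed.
    delta_hid = [h * (1 - h) * sum(delta_out[k] * w_out[k][j]
                                   for k in range(len(w_out)))
                 for j, h in enumerate(hidden)]
    # Weight update, output layer first: w_ij += eta * delta_j * x_i.
    hb = hidden + [1.0]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += eta * delta_out[k] * hb[j]
    xb = x + [1.0]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_hid[j] * xb[i]

def total_error(samples, w_hidden, w_out):
    return sum((dj - y) ** 2
               for x, d in samples
               for dj, y in zip(d, forward(x, w_hidden, w_out)[1]))

random.seed(0)
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
w_hidden = [[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]
w_out = [[random.uniform(-0.1, 0.1) for _ in range(3)]]
error_before = total_error(samples, w_hidden, w_out)
for _ in range(2000):                      # repeat steps 2-4
    for x, d in samples:
        train_step(x, d, w_hidden, w_out, eta=0.5)
error_after = total_error(samples, w_hidden, w_out)
```

Note that every training sample carries its desired output, which is exactly the information that is unavailable when the categories must be discovered from the data.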


2. UNSUPERVISED LEARNING 


A good example of a neural network algorithm that utilizes unsupervised 
learning is the competitive learning network shown in Figure 3.3. This algorithm is
designed to take the set of input vectors and use them to form a set of categories; 
one category for each node on the second level of the network. This is accomplished 
by measuring the proximity of each input vector to the set of weights for each node 
on the second level and adaptively adjusting the weights of the closest node towards 
the input vector. After sufficient training, the network should categorize all input 
vectors which are similar into the same category based on their distance from the 
weight vector of each node. 


Figure 3.3 Competitive Learning Network

The training of the competitive learning algorithm proceeds as follows.

• Step 1 Initialize the weights from the N input nodes to the M output nodes
with small random numbers.

• Step 2 Present an input vector from the data set.

• Step 3 Compute the distance $d_j$ between the input vector and the weights of
each output node $j$ using the formula

$$d_j = \sum_{i=0}^{N-1} \left( x_i(t) - w_{ij}(t) \right)^2 \qquad (3.5)$$

where $x_i(t)$ is the input to node $i$ at time $t$, and $w_{ij}(t)$ is the interconnection
weight from input node $i$ to output node $j$ at time $t$. Note that this distance
measure is just the unnormalized MSE between the input vector and the weight
vector of output node $j$.

• Step 4 Select the output node $j^*$ which is closest to the input vector.

• Step 5 Update the weights of the closest output node $j^*$ using the expression

$$w_{ij^*}(t+1) = w_{ij^*}(t) + \eta(t) \left( x_i(t) - w_{ij^*}(t) \right) \qquad (3.6)$$

where $\eta(t)$ is the time dependent learning rate.

• Step 6 Get the next input vector and return to step 2.

We continue training the network until convergence is obtained or the average error
for the entire data set is less than some threshold value.
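The training steps above can be sketched as follows. The decaying learning rate schedule, the fixed number of passes through the data, and the names `competitive_step` and `train` are assumptions made for illustration. The starting weights are fixed small values rather than random ones so that the outcome is repeatable.

```python
def sq_dist(x, w):
    """Unnormalized squared error between an input and a weight vector."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def competitive_step(x, weights, eta):
    """Steps 3-5: find the closest output node and move its weights toward x."""
    winner = min(range(len(weights)), key=lambda j: sq_dist(x, weights[j]))
    weights[winner] = [w + eta * (xi - w) for xi, w in zip(x, weights[winner])]
    return winner

def train(data, weights, epochs=20, eta0=0.5):
    """Cycle through the data; return how often each output node won."""
    wins = [0] * len(weights)
    t = 0
    for _ in range(epochs):
        for x in data:
            eta = eta0 / (1.0 + 0.05 * t)   # assumed decay schedule
            wins[competitive_step(x, weights, eta)] += 1
            t += 1
    return wins

# Two clusters of grey-level pairs, far from the small initial weights.
data = [[150.0, 150.0], [152.0, 148.0], [200.0, 200.0], [198.0, 202.0]]
weights = [[0.01, 0.01], [0.03, 0.02]]
wins = train(data, weights)
```

In this run the node that wins the first presentation is pulled toward the data and then wins every later presentation, so the other node is never updated.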

It is not hard to see the resemblance between vector quantization and the 
task performed by the competitive learning algorithm. To implement VQ, we just 
present each block of the image as an input vector and train the neural network until 
it converges. Then the weight vectors produced for each output node are the code 
vectors for the VQ codebook, and the indices of the output nodes are the
corresponding codewords. After training, the weights are fixed and the codebook is transmitted
to the receiving site. Then each block to be transmitted is submitted to the neural
network. The index of the closest output node to the input is transmitted as the VQ
codeword. At the receiving site, the codeword is used as the argument in a lookup 
table in which the codewords and the corresponding code vectors are stored. The 
code vector chosen is then converted into an image block which serves as an approxi- 
mation to the original block. After the codewords for all the blocks in the image are 
transmitted and decoded, the final reproduction image is assembled from the code 
vector approximations. 
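The encode, transmit, and decode path described above can be sketched as follows. The sketch assumes that the image dimensions are multiples of the block dimensions and that blocks are scanned in row-major order; the function names and the two-entry codebook are hypothetical.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def image_to_blocks(image, bh, bw):
    """Split an image (list of rows) into flattened bh x bw blocks, row-major."""
    blocks = []
    for r in range(0, len(image), bh):
        for c in range(0, len(image[0]), bw):
            blocks.append([image[r + i][c + j]
                           for i in range(bh) for j in range(bw)])
    return blocks

def blocks_to_image(blocks, height, width, bh, bw):
    """Inverse of image_to_blocks."""
    image = [[0] * width for _ in range(height)]
    block_iter = iter(blocks)
    for r in range(0, height, bh):
        for c in range(0, width, bw):
            block = next(block_iter)
            for i in range(bh):
                for j in range(bw):
                    image[r + i][c + j] = block[i * bw + j]
    return image

def encode(blocks, codebook):
    """Transmitter: index of the closest code vector for each block."""
    return [min(range(len(codebook)), key=lambda j: sq_dist(b, codebook[j]))
            for b in blocks]

def decode(codewords, codebook):
    """Receiver: table lookup of the code vector for each codeword."""
    return [codebook[u] for u in codewords]

codebook = [[0, 0, 0, 0], [255, 255, 255, 255]]   # one dark and one bright 2 x 2 block
image = [[10, 5, 250, 240],
         [0, 20, 255, 230]]
codewords = encode(image_to_blocks(image, 2, 2), codebook)
reproduction = blocks_to_image(decode(codewords, codebook), 2, 4, 2, 2)
```

Only the list `codewords` needs to be sent once the receiver holds the codebook.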

The competitive learning algorithm is now applied to the two dimensional 
VQ example presented in the previous chapter. The trajectories of the code vectors 
are presented in Figure 3.4. Notice that the algorithm attempts to represent the 
data with a single code vector. This occurs because the code vector that is closest 
for the first input vector continues to be the closest for all subsequent input vectors. 
Thus none of the other code vectors are ever utilized and their weights are never 
updated. An algorithm such as this clearly does not utilize all its code vectors and 
thus cannot produce an optimum vector quantizer. In the next section we examine 
modifications to the competitive learning algorithm which improve its performance
as a vector quantizer.


C. FREQUENCY SENSITIVE COMPETITIVE LEARNING 

As shown in the previous section, the principal problem with using competitive 
learning as a vector quantizer is the under-utilization of the output nodes. This 
problem has been addressed in the literature and several possible solutions have been 
presented. In [Ref. 10], an algorithm referred to as the Self Organizing Map (SOM) 
is introduced. In the SOM, a neighborhood is defined about the closest output node 
and this neighborhood is used to update more than one output node at a time. In
this technique the update formula becomes

$$w_{ij}(t+1) = w_{ij}(t) + \eta(t) \left( x_i(t) - w_{ij}(t) \right), \quad j \in N(j^*, t) \qquad (3.7)$$

where $j^*$ is the index of the closest output node and $N(j^*, t)$ is the neighborhood defined
about the closest output node at time $t$. This neighborhood is started with a large size to
encourage the updating of many output nodes, and is then gradually shrunk with time
as the network converges to generate more fine structure. Finally, the neighborhood
shrinks to a single node, which allows each node to be updated independently. At this
point the SOM algorithm is identical to the original competitive learning. We can
see that the improvement in output node utilization comes from establishing a good
distribution of weight vectors throughout the input vector space and then allowing
the network to converge. The drawback of this technique is that the resulting network
takes an excessive number of iterations to reach convergence.

Figure 3.4 Competitive Learning 2-D Example
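A single SOM update can be sketched as below. The arrangement of the output nodes is not specified here, so the sketch assumes that they lie along a line and that the neighborhood consists of all nodes whose index is within `radius` of the winner; both the arrangement and the name `som_step` are our assumptions.

```python
def sq_dist(x, w):
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def som_step(x, weights, eta, radius):
    """Update the closest node and its index neighbors within `radius`."""
    winner = min(range(len(weights)), key=lambda j: sq_dist(x, weights[j]))
    lo = max(0, winner - radius)
    hi = min(len(weights) - 1, winner + radius)
    for j in range(lo, hi + 1):
        weights[j] = [w + eta * (xi - w) for xi, w in zip(x, weights[j])]
    return winner

weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
# With radius 1 the winner (node 1) and nodes 0 and 2 all move toward the input.
winner = som_step([1.0, 1.0], weights, eta=0.5, radius=1)
```

Calling `som_step` with `radius=0` updates only the winner, which is the plain competitive learning update.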

Another technique, termed adding a conscience to competitive learning, is
presented in [Ref. 11]. In this algorithm we generate a new variable, $p_j$, for each output
node which represents the percentage of the time that a particular node is the closest
to the input vector. This variable is initialized to zero and updated by

$$p_j(t+1) = \begin{cases} (1 - B)\, p_j(t) + B & \text{for } j = j^* \\ (1 - B)\, p_j(t) & \text{for } j \neq j^* \end{cases} \qquad (3.8)$$

where $B$ is a constant which is chosen small enough to prevent random fluctuation in
the input data from having too large an effect on $p_j$. Then a bias term, $b_j$, is calculated
using $b_j = C(1/M - p_j)$, where $C$ is termed the bias constant. This bias term is then
applied to the distance measure for each output node, and the closest node is chosen
based on this biased distance, $d_j - b_j$. The result of these modifications is to penalize
the output nodes that have won the competition frequently. This produces a very
uniform output node utilization. This algorithm has the advantage of converging
quickly while maintaining good output node utilization, but requires twice as many
distance calculations as the original competitive learning algorithm.
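The selection rule with a conscience can be sketched as below. The ordering used here (first update the win fractions using the node that is closest in the unbiased sense, then choose the winner from the biased distances) is our reading of the description and should be treated as an assumption, as should the name `conscience_select` and the parameter values.

```python
def conscience_select(distances, p, B, C):
    """Update the win fractions p in place and return the biased winner."""
    M = len(distances)
    # Node that is closest according to the unbiased distances.
    closest = min(range(M), key=lambda j: distances[j])
    for j in range(M):
        p[j] = (1 - B) * p[j] + (B if j == closest else 0.0)
    # Bias term b_j = C (1/M - p_j): negative for nodes that win too often.
    bias = [C * (1.0 / M - pj) for pj in p]
    return min(range(M), key=lambda j: distances[j] - bias[j])

# Node 0 is closer, but it has been closest about 90 percent of the time.
p = [0.9, 0.1]
winner = conscience_select([1.0, 2.0], p, B=0.1, C=10.0)
```

In this example the biased distances are 5.1 for node 0 and -2.1 for node 1, so the rarely used node wins the competition.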

A variation on the conscience technique discussed above is Frequency Sensitive
Competitive Learning (FSCL) [Ref. 12]. In this algorithm the distance $d_j$ between
the input vector and the output node weight vector is modified by

$$d'_j = g(u_j)\, d_j \qquad (3.9)$$

where $u_j$ is the number of times the output node $j$ has won the competition and $g$ is
termed the fairness function, with $g(u_j) = u_j$ in most cases. The effect of this
modification is to increase the modified distance for those nodes which win frequently.
Over many training iterations, the result is a remarkably even node utilization. This
algorithm preserves the fast convergence of the conscience method and also requires
us to update only one set of weights for each input vector. In addition, the algorithm
requires only one set of distance calculations and is thus much faster than the
conscience method. The FSCL vector quantizer is the basic building block which will be
used in the algorithms in the next chapter.
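A minimal FSCL training loop is sketched below, using the fairness function g(u) = u. Starting the win counts at 1 (so that the modified distances are not all zero at the outset), the constant learning rate, and the name `fscl_train` are assumptions made for illustration.

```python
def sq_dist(x, w):
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def fscl_train(data, weights, epochs=20, eta=0.1, fairness=lambda u: u):
    """Competitive learning with distances scaled by each node's win count."""
    counts = [1] * len(weights)   # assumed initial value
    for _ in range(epochs):
        for x in data:
            winner = min(range(len(weights)),
                         key=lambda j: fairness(counts[j]) * sq_dist(x, weights[j]))
            weights[winner] = [w + eta * (xi - w)
                               for xi, w in zip(x, weights[winner])]
            counts[winner] += 1
    return counts

# Two clusters far from the small initial weights, a setting in which plain
# competitive learning can leave one node unused.
data = [[150.0, 150.0], [152.0, 148.0], [200.0, 200.0], [198.0, 202.0]]
weights = [[0.01, 0.01], [0.03, 0.02]]
counts = fscl_train(data, weights)
```

Here the node that wins the first presentation has its distance doubled on the second presentation, which is enough to let the other node win and begin moving toward the data.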

We first apply the FSCL vector quantizer to the same 2-D example for which 
the competitive learning algorithm failed. The trajectories of the code vectors are 
shown in Figure 3.5. The FSCL clearly solves the problem of node utilization and 
produces the same result as the LBG algorithm. 

The FSCL has been applied to the vector quantization of images [Ref. 13]
and some interesting results have emerged. Figure 3.6 shows the number of training
iterations required by the FSCL and LBG algorithms. For a small codebook, the
FSCL has a sizable computational advantage, while for larger codebooks the LBG
algorithm is more efficient. To get an idea of how codebook size affects reproduction
quality, we have applied the FSCL algorithm using various codebook sizes. The
original image is 256 x 256 pixels (See Figure 3.7) and was divided into blocks of
various sizes to produce a data rate of 1 bit/pixel for each example. Figure 3.8 shows
an image produced with a block size of 2 x 2 and a codebook size of 16. Figure 3.9
shows an image produced using a block size of 3 x 2 and a codebook size of 64.
Figure 3.10 shows an image produced using a 3 x 3 block and a codebook size of
512. We can clearly see that the larger codebooks produce a much better quality
of reproduction at the same data rate. This leaves us with the question of how to
get the good reproduction quality of large codebooks while also taking advantage of
the computational efficiency of the FSCL algorithm for generating small codebooks.
The next chapter demonstrates several techniques that can be applied to the FSCL
algorithm which allow us to form large codebooks without the excessive amount of
training required by the original algorithm.

Figure 3.6 Training Required For LBG and FSCL Algorithms
Figure 3.7 Original Image
Figure 3.8 FSCL Using a Size 16 Codebook and a 2x2 Block
Figure 3.9 FSCL Using a Size 64 Codebook and a 3x2 Block
Figure 3.10 FSCL Using a Size 512 Codebook and a 3x3 Block
IV. ALGORITHM DEVELOPMENT


A. INTRODUCTION 

The previous two chapters provided an overview of vector quantization, and 
how neural networks have been applied to this problem. Here we investigate the lim- 
itations of existing algorithms, and propose modifications which substantially reduce 
the computational requirements without significant loss in performance. 

As we saw in Figures 3.8-3.10, the reproduction quality of a vector quantizer 
depends strongly on the dimensionality of the vector utilized. We wish to use the 
maximum dimensionality possible, but we are limited by the fact that the codebook 
size grows exponentially with increasing vector dimension. Whether we plan to im- 
plement the neural network by simulation or in hardware, this limitation introduces 
significant difficulties. 
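
The exponential growth can be made concrete with a short calculation. At a fixed
rate of r bits/pixel, a block of k pixels is assigned rk bits, so the codebook needs
2^(rk) entries. The sketch below is only an illustration (the function name
`codebook_size` is ours, not part of the thesis software); at 1 bit/pixel it gives
the codebook sizes used in the examples of this chapter.

```python
def codebook_size(block_rows, block_cols, bits_per_pixel=1):
    """Number of code vectors needed to code one block at the given rate."""
    bits_per_block = block_rows * block_cols * bits_per_pixel
    return 2 ** bits_per_block


# Block sizes used in the simulations, all coded at 1 bit/pixel.
for rows, cols in [(2, 2), (3, 2), (3, 3), (4, 3)]:
    print(f"{rows} x {cols} block -> codebook size {codebook_size(rows, cols)}")
```

Each additional pixel in the block doubles the codebook size at this rate, which is
why the block size cannot simply be increased.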

In the case of simulation, the large capacity memory chips available today allow 
us to implement very large codebooks. However, we can see from Figure 3.6 that 
for very large codebooks, the neural network algorithm has a substantially higher 
computational cost than the Linde, Buzo, and Gray (LBG) algorithm. So in order to 
make the neural network simulation useful, we must limit ourselves to small codebooks 
and thus poor performance, or find a way to form a codebook with a large effective 
size by combining many smaller codebooks. 

In the case of hardware implementation, the computational disadvantage of the
neural network for large codebook sizes is substantially mitigated by the advantage
gained from parallel processing. In this case, however, the codebook size is limited
by the number of processing elements which can be implemented in hardware. Even
with expected advances in neural network hardware, it is still important to maximize
the effective codebook size for a given number of processing elements.

In the following sections we investigate algorithms which improve performance
for both hardware and simulation implementations. These algorithms allow a
large codebook to be formed from many small codebooks, and allow a large effective
codebook to be formed using a substantially smaller number of processing elements.

The vector quantizers we have examined so far are optimal in two senses. First,
the codebook formed produces the minimum MSE possible for the training data
utilized; second, the encoder always picks the codeword corresponding to the
code vector which produces the least distortion for any given input vector. This type of
algorithm is called full search vector quantization (FSVQ), and it must calculate a
number of distortions equal to the size of the codebook for each vector processed. As
noted above, this property makes full search codes impractical except for the case of
small codebooks.
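
The cost of the full search can be seen in a minimal sketch of the encoder. This is
not the thesis software; the function names and the toy codebook values are
assumptions made for illustration, and squared error is used as the distortion.

```python
def squared_error(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def full_search_encode(x, codebook):
    """Return (index of nearest code vector, number of distortions computed)."""
    best_index, best_dist = 0, squared_error(x, codebook[0])
    for i in range(1, len(codebook)):
        d = squared_error(x, codebook[i])
        if d < best_dist:
            best_index, best_dist = i, d
    # Every code vector is visited, so the count equals the codebook size.
    return best_index, len(codebook)


# Toy 2-dimensional codebook of size 4 (illustrative values only).
toy_codebook = [(0, 0), (0, 10), (10, 0), (10, 10)]
index, n_dist = full_search_encode((9, 2), toy_codebook)
print(index, n_dist)
```

The number of distortion calculations is always the codebook size, regardless of the
input, which is the property that the structured codes below are designed to avoid.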

We now consider algorithms that produce codes which are suboptimal in both
senses mentioned above. They may not produce a codebook which yields the
minimum MSE for the training data, and they may not select the codeword
corresponding to the smallest distortion available. However, these algorithms produce
codebooks which have structure that dramatically reduces the computational effort
required for a given codebook size. Although the performance is degraded relative to
a full search algorithm, the suboptimal algorithm can offer such a large reduction in
complexity that a larger codebook may be implemented. This in turn can provide
better performance at a smaller computational cost than the full search algorithm.
These algorithms are described below.


B. TREE SEARCHED VECTOR QUANTIZATION

The TSVQ [Ref. 14] design was developed in an attempt to reduce the number
of distance calculations which must be made to encode a vector. In the neural network
implementation, the software simulation benefits from this reduction in distance
calculations, and we also see a reduction in the amount of
training required. This improvement stems from the fact that the structure of the
TSVQ produces data subsets for which the basic FSCL vector quantizer
converges more quickly.

The TSVQ algorithm is a structure which causes us to search a sequence of
smaller codebooks rather than a single large one. This is accomplished by arranging
many small vector quantizers in a tree structure as shown in Figure 4.1. The tree
is searched starting with the root, and each search of the smaller vector quantizers
advances one level through the tree. An m level TSVQ is characterized by the m-tuple
$R = (R_1, R_2, \ldots, R_m)$, which describes the number of bits encoded at each level of
the tree. So each vector quantizer at level $j$ has $2^{R_j}$ codewords, and
$\prod_{i=1}^{j-1} 2^{R_i}$ vector quantizers are required to complete level $j$ (a single
quantizer at the root, $j = 1$). The codebook size for the entire
structure is $\prod_{i=1}^{m} 2^{R_i} = 2^{R_1 + R_2 + \cdots + R_m}$.
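
These counts are easy to tabulate. The following sketch (the function name
`tsvq_structure` is hypothetical) computes the number of vector quantizers on each
level, the total number of small codebooks, and the effective codebook size for a
given bit allocation R.

```python
def tsvq_structure(R):
    """For a TSVQ with bit allocation R = (R_1, ..., R_m), return
    (quantizers per level, total quantizers, effective codebook size)."""
    per_level = []
    quantizers = 1                      # a single quantizer at the root
    for bits in R:
        per_level.append(quantizers)
        quantizers *= 2 ** bits         # each codeword selects one child quantizer
    return per_level, sum(per_level), 2 ** sum(R)


for R in [(2, 2), (3, 3), (3, 3, 3)]:
    print(R, tsvq_structure(R))
```

For R = (3, 3, 3), for example, the levels hold 1, 8, and 64 quantizers, so 73 small
codebooks of size 8 form an effective codebook of size 512.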

The encoding of a vector proceeds by first applying the input vector, $x$, to
the vector quantizer at the root of the tree structure. This produces the closest
code vector, $y_1$, which is our first estimate of $x$, and the first $R_1$ bits of the channel
codeword. This $R_1$-tuple, $u_1$, also serves as the index of the vector quantizer to be
searched in the next level. Thus each codeword in level one provides a mapping to
a vector quantizer in level two. We then present $x$ to the size $2^{R_2}$ vector quantizer
selected at level two, which produces a new estimate $y_2$ and the second portion of the
channel codeword, $u_2$. We use the vector $(u_1, u_2)$ to choose the vector quantizer to
search at the third level. This process continues until the final level is reached. At this
point, we have produced our final estimate $y_m$ and the complete channel codeword,
$u = (u_1, u_2, \ldots, u_m)$. This structure allows us to encode a vector using only
$\sum_{i=1}^{m} 2^{R_i}$ distance calculations in contrast with the $2^R$ calculations required
for the full search method ($R = R_1 + R_2 + \cdots + R_m$). Table 4.1 shows some examples
of how large the computational savings for the encoding step can be for TSVQ. The R
vector listed in the table describes the architecture of the particular TSVQ structure
used in the example. This notation is the m-tuple introduced earlier in this section.


TABLE 4.1: Number of Encoding Distance Calculations Required


R      | Block Size | Codebook Size | FSVQ | TSVQ
(2,2)  | 2x2        | 16            | 16   | 8
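
The encoding procedure described above can be sketched as a walk down the tree.
This is a minimal illustration under stated assumptions, not the thesis
implementation: the tree is represented as nested dictionaries with hypothetical
keys "codebook" and "children", the vectors are one-dimensional, and the code
vector values are made up.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared error to x."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    return dists.index(min(dists))


def tsvq_encode(x, root):
    """Walk the tree from the root; return (channel codeword, estimate at
    each level, number of distance calculations)."""
    node, codeword, estimates, n_dist = root, [], [], 0
    while node is not None:
        j = nearest(x, node["codebook"])
        n_dist += len(node["codebook"])     # only this small codebook is searched
        codeword.append(j)
        estimates.append(node["codebook"][j])
        children = node.get("children")
        node = children[j] if children else None
    return tuple(codeword), estimates, n_dist


def leaf(codebook):
    """A final-level quantizer with no children."""
    return {"codebook": codebook, "children": None}


# Toy two-level tree with R = (1, 1); values are illustrative only.
toy_tree = {
    "codebook": [(2.0,), (8.0,)],
    "children": [leaf([(1.0,), (3.0,)]), leaf([(7.0,), (9.0,)])],
}
u, y, n = tsvq_encode((6.6,), toy_tree)
print(u, y, n)
```

In this toy case the tree is too small to show a saving (2 + 2 distance
calculations against 4 for a full search), but the same routine applied to
R = (3, 3, 3) would compute 24 distances instead of 512.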








The training of the TSVQ proceeds one level at a time. We first apply the entire
training set to the FSCL vector quantizer at the root of the tree until convergence
is obtained. We then use the codebook produced to divide the data set into
$2^{R_1}$ subsets based on their proximity to the newly generated code vectors. The new
subsets are then applied to the $2^{R_1}$ vector quantizers on level two. We proceed in this
way until the vector quantizers at the final level, m, have been trained. Each vector
quantizer in the tree is initialized by randomly selecting vectors from the appropriate
training set. This type of initialization speeds convergence of the neural networks.
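
The level-by-level training can be sketched as a recursion over the data subsets.
The sketch below is an illustration only. The trainer passed in as `train_vq` is a
hypothetical interface standing in for the FSCL algorithm, and `quantile_trainer`
is a deterministic stand-in used so that the example is reproducible; it is not
FSCL and does not use the random initialization described above. The sketch also
assumes every subset is non-empty.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared error to x."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    return dists.index(min(dists))


def partition_by_codeword(training_set, codebook):
    """Split the training set into one subset per code vector."""
    subsets = [[] for _ in codebook]
    for x in training_set:
        subsets[nearest(x, codebook)].append(x)
    return subsets


def train_tsvq(training_set, R, train_vq):
    """Train a TSVQ one level at a time.

    train_vq(data, size) is a caller-supplied, hypothetical routine that
    returns a codebook of `size` code vectors (for example an FSCL trainer).
    """
    codebook = train_vq(training_set, 2 ** R[0])
    if len(R) == 1:
        return {"codebook": codebook, "children": None}
    subsets = partition_by_codeword(training_set, codebook)
    children = [train_tsvq(s, R[1:], train_vq) for s in subsets]
    return {"codebook": codebook, "children": children}


def quantile_trainer(data, size):
    """Deterministic stand-in for FSCL: pick evenly spaced sorted samples."""
    s = sorted(data)
    return [s[(2 * i + 1) * len(s) // (2 * size)] for i in range(size)]


toy_data = [(float(v),) for v in range(8)]
toy_tree = train_tsvq(toy_data, (1, 1), quantile_trainer)
print(toy_tree)
```

Each child quantizer sees only the vectors that fell nearest to its parent
codeword, which is the source of the structure in the data subsets.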

This structure allows us to greatly reduce the number of distance calculations
necessary for the software simulation case. This is true because the path through the
tree allows us to ignore the vast majority of code vectors which are far from the input
vector. TSVQ also displays a property which is termed graceful degradation. This
means that if the codeword must be truncated due to channel capacity considerations,
it will still be possible to send a good estimate of the data at the new lower data rate.
This is in contrast to the full search method, whose codeword conveys no useful
information if it is truncated. An added benefit of this method is that the structure
imposed on each of the data subsets applied to the FSCL vector quantizers causes
them to converge more quickly. This provides a substantial reduction in training
required for both software and hardware implementations.
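
The data rates available under truncation follow directly from the bit allocation.
As a small illustration (the function name `truncated_rates` is ours), the rate
after keeping only the first j levels of the codeword is the sum of the first j
entries of R divided by the number of pixels in the block.

```python
from fractions import Fraction


def truncated_rates(R, block_pixels):
    """Bits/pixel delivered when only the first j levels of the codeword
    are kept, for j = 1, ..., m."""
    rates, bits = [], 0
    for level_bits in R:
        bits += level_bits
        rates.append(Fraction(bits, block_pixels))
    return rates


# Three-level tree of size 8 codebooks on a 3 x 3 block.
print([str(r) for r in truncated_rates((3, 3, 3), 9)])
```

For the three-level, 3 x 3 block example this gives 1/3, 2/3, and 1 bit/pixel,
matching the rates quoted for the intermediate stages later in this section.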

A final benefit of the method is the large reduction in the number of processing
elements required for hardware implementation. Since the TSVQ algorithm updates
only the weights of the vector quantizers of the path taken for each input vector
applied, these are the only vector quantizers that must be realized in hardware. Thus
we can convert the hardware implementation from a tree structure to a linear structure
(see Figure 4.2) along with memory and a system to load the appropriate weights for
each level. Thus we can reduce the number of processing elements required from
$\sum_{j=1}^{m} \prod_{i=1}^{j} 2^{R_i}$ to $\sum_{j=1}^{m} 2^{R_j}$. Table 4.2 shows some
examples of the number of processing elements required if the TSVQ is implemented
in hardware using the tree structure and the linear structure. For larger block sizes
and codebook sizes, the savings are substantial.


TABLE 4.2: Number of Processing Elements Required


R      | Block Size | Codebook Size | Tree Structure | Linear Structure
(2,2)  | 2x2        | 16            | 20             | 8
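
As a worked example of the size of this reduction, consider the three level tree of
size 8 codebooks, R = (3, 3, 3), used for the 3 x 3 block. The arithmetic below
assumes one processing element per code vector, so that the tree count is the total
number of code vectors in the tree and the linear count is the number on a single
path.

```latex
\[
\text{tree: } \sum_{j=1}^{3} \prod_{i=1}^{j} 2^{R_i} = 8 + 64 + 512 = 584,
\qquad
\text{linear: } \sum_{j=1}^{3} 2^{R_j} = 8 + 8 + 8 = 24 .
\]
```

Under that assumption the linear structure needs roughly 4 percent of the
processing elements of the full tree.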











For the simulations, a single 256 x 256 pixel image was utilized. This image
was divided into blocks of various sizes to achieve a data rate of 1 bit/pixel for each
example. The 1 bit/pixel rate provided a standard to allow comparisons between examples
with different codebook sizes, and provided a challenging enough problem to allow
good comparisons to be made.

Figure 4.1 Tree Search Vector Quantization
Figure 4.2 Linear Hardware Implementation Of TSVQ

The simulation results for TSVQ are shown in Figures 4.3-4.6. The original
image is shown in Figure 4.3. TSVQ with a block size of 2 x 2 and a size 16
codebook constructed from five size 4 codebooks arranged in a two level tree is shown
in Figure 4.4. TSVQ with a block size of 3 x 2 and a size 64 codebook constructed
from nine size 8 codebooks arranged in a two level tree is shown in Figure 4.5. TSVQ
with a block size of 3 x 3 and a size 512 codebook constructed from 73 size 8 codebooks
arranged in a three level tree is shown in Figure 4.6. It is easy to see the strong effect
of codebook size on performance by noting the improvement in subjective quality
as the codebook size is increased from 16 to 64 to 512. In particular, the larger
codebook sizes display an image that appears sharper because the small codebooks
do not contain a sufficient number of code vectors to represent edges well. Also, the
small codebook does not contain code vectors with enough different grey levels to
reproduce gradually changing intensities, such as those in the top of the hat or near
the beam to the left of the hat. This is confirmed by the MSE performance, which is
displayed in Figure 4.7. Comparing the MSE performance of TSVQ to the full search
algorithm, we can see that the loss of performance is very small. This is reinforced by
comparing Figures 4.4-4.6 for TSVQ and Figures 3.8-3.10 for full search, which show
that the degradation caused by use of the TSVQ method is small in the subjective
sense as well.

To give an idea of the refinement taking place at each level, each stage of the
three stage TSVQ example in Figure 4.6 is shown in Figures 4.8-4.10. The improvement
taking place at each level is clear. We can also get a good idea of what would be
reconstructed if the code were truncated. Figure 4.8 corresponds to 0.33 bits/pixel,
Figure 4.9 corresponds to 0.67 bits/pixel, and Figure 4.10 corresponds to 1.0 bit/pixel.
It is apparent that a degraded but nevertheless useful image is still available if the
code is truncated. This is the property previously termed graceful degradation.

Figure 4.3 Original
Figure 4.4 TSVQ Using a Size 16 Codebook and a 2x2 Block
Figure 4.5 TSVQ Using a Size 64 Codebook and a 3x2 Block
Figure 4.6 TSVQ Using a Size 512 Codebook and a 3x3 Block
Figure 4.7 Performance vs. Block Size
Figure 4.8 TSVQ Using 3x3 Block, First Stage
Figure 4.9 TSVQ Using 3x3 Block, Second Stage
Figure 4.10 TSVQ Using 3x3 Block, Third Stage

The improvement in computational cost can be seen in Figure 4.11. For the
three examples, the savings varied from 68 to 72 percent. In other words, the TSVQ
algorithm required only about 1/3 to 1/4 of the computation. As stated before, this
advantage is a result of utilizing smaller, more efficient codebooks to form a single large
effective codebook. This large computational advantage is gained at a very modest
loss of performance. This makes the TSVQ an extremely attractive alternative to the
FSCL algorithm.


C. MULTI STAGE VECTOR QUANTIZATION 

We saw in the last section that the TSVQ algorithm offers many advantages
for neural network vector quantizers, but that some troublesome limitations remain.
First, for both TSVQ and FSVQ, the load on the channel of transmitting updates for
very large codebooks can be excessive. Second, even though TSVQ can reduce the
training effort, a large number of passes through the image is still required for good
performance. Finally, although TSVQ produces a codebook with structure, it actually
increases the storage required for the codebook. We now examine the application
of a technique termed Multiple Stage Vector Quantization (MSVQ) [Ref. 15] to the
basic FSCL vector quantizer. This technique has the advantage of further reducing
the computational cost and allowing a very efficient hardware implementation.

Figure 4.12 Multi Stage Vector Quantization
Figure 4.13 Classification Vector Quantization

Like TSVQ, MSVQ has two or more levels, but instead of working with the
original input vector at each stage as in TSVQ, MSVQ attempts to encode the error
generated at the previous level. An m level MSVQ (see Figure 4.12) can be described
by the m-tuple $R = (R_1, R_2, \ldots, R_m)$, where $R_i$ is the number of bits used to encode
the error at level $i$ of the MSVQ. The first level of the MSVQ is just a normal FSCL
vector quantizer. The input vector, $x$, is applied to the vector quantizer at level one
and the first estimate, $y_1$, is produced along with the first $R_1$ bits of the channel
codeword, $u_1$. Next, the first error vector is formed by taking the vector difference,
$x - y_1$. This error vector, $e_1$, is then applied to the size $2^{R_2}$ vector quantizer at level
two, which produces an estimate of the error vector, $\hat{e}_1$, and the next $R_2$ bits of the
channel codeword. So at the second level, our estimate of the input vector is the
vector sum $y_2 = y_1 + \hat{e}_1$. In the following stages, we continue to form an error vector
from the previous stage and use an FSCL vector quantizer to encode this error. Each
stage produces an estimate for the error and a portion of the channel codeword. At
the last stage, the error vector $e_{m-1}$ is encoded and the final estimate of the input
vector is available by performing $y_m = y_1 + \hat{e}_1 + \hat{e}_2 + \cdots + \hat{e}_{m-1}$, and the full
channel codeword is $u = (u_1, u_2, \ldots, u_m)$.
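
The stage-by-stage encoding of the error can be sketched as follows. This is an
illustration only, not the thesis software: the function names, the list-of-codebooks
representation, and the toy one-dimensional code vectors are all assumptions.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared error to x."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    return dists.index(min(dists))


def msvq_encode(x, stage_codebooks):
    """Return (channel codeword, running estimate after each stage)."""
    residual = list(x)                  # stage one quantizes the input itself
    estimate = [0.0] * len(x)
    codeword, estimates = [], []
    for codebook in stage_codebooks:
        j = nearest(residual, codebook)
        codeword.append(j)
        estimate = [e + c for e, c in zip(estimate, codebook[j])]
        residual = [r - c for r, c in zip(residual, codebook[j])]
        estimates.append(tuple(estimate))
    return tuple(codeword), estimates


def msvq_decode(codeword, stage_codebooks):
    """Sum the selected code vector from each stage. A truncated codeword
    simply stops early and yields the coarser estimate."""
    total = [0.0] * len(stage_codebooks[0][0])
    for j, codebook in zip(codeword, stage_codebooks):
        total = [t + c for t, c in zip(total, codebook[j])]
    return tuple(total)


# Toy two-stage example with R = (1, 1); values are illustrative only.
toy_stages = [[(2.0,), (8.0,)], [(-1.0,), (1.0,)]]
u, y = msvq_encode((6.6,), toy_stages)
print(u, y, msvq_decode(u, toy_stages), msvq_decode(u[:1], toy_stages))
```

Note that only one small codebook per stage is needed, whichever code vector was
selected at the previous stage.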

For encoding, MSVQ requires $\sum_{i=1}^{m} 2^{R_i}$ distance calculations, which is the same
as TSVQ and much less than the $2^R$ required for FSVQ. However, the MSVQ requires
only m vector quantizers and thus m small codebooks to be stored, as compared with
$\sum_{j=1}^{m} \prod_{i=1}^{j-1} 2^{R_i}$ for TSVQ. Table 4.3 shows the difference in the number
of codebooks required by TSVQ and MSVQ for some of the examples used in simulations.
This reduces the total number of code vectors to be stored from
$\sum_{j=1}^{m} \prod_{i=1}^{j} 2^{R_i}$ for TSVQ and $2^R$ for FSVQ to $\sum_{i=1}^{m} 2^{R_i}$ for
MSVQ. Table 4.4 shows the total number of code vectors
which must be stored for several examples of FSVQ, TSVQ, and MSVQ. We can see
that there is a storage price to be paid for the computational advantage of TSVQ,
but the MSVQ provides a large reduction in both. This dramatically reduces both
storage requirements and the load on the channel from transmitting codebook
updates. Table 4.5 shows the extra load on the channel for each of the three algorithms
assuming that the codebook is updated with each frame. The advantage of MSVQ
in this regard for large codebooks is apparent.
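
The storage comparison can be reproduced from the bit allocation alone. The sketch
below is illustrative; the function names are ours, and the channel load calculation
assumes 8 bits per code vector component and one full codebook transmission per
256 x 256 frame. Those two values are assumptions and are not taken from the tables.

```python
def code_vectors_stored(R):
    """Code vectors stored by FSVQ, TSVQ and MSVQ for bit allocation R."""
    fsvq = 2 ** sum(R)
    tsvq, on_this_level = 0, 1
    for bits in R:
        on_this_level *= 2 ** bits      # code vectors on this level of the tree
        tsvq += on_this_level
    msvq = sum(2 ** bits for bits in R)
    return {"FSVQ": fsvq, "TSVQ": tsvq, "MSVQ": msvq}


def codebook_load_bits_per_pixel(n_vectors, block_pixels,
                                 image_pixels=256 * 256, bits_per_component=8):
    """Extra channel load if the whole codebook is sent once per frame.
    bits_per_component = 8 is an assumption, not a value from the text."""
    return n_vectors * block_pixels * bits_per_component / image_pixels


counts = code_vectors_stored((3, 3, 3))
for name, n in counts.items():
    print(name, n, codebook_load_bits_per_pixel(n, block_pixels=9))
```

Under these assumptions the size 512 FSVQ codebook for a 3 x 3 block adds more
than half a bit per pixel to a 1 bit/pixel code, while the three MSVQ codebooks add
under 0.03 bits/pixel.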


As with TSVQ, the training of the MSVQ proceeds one level at a time. We apply
the original data set to the FSCL vector quantizer at the first level until convergence is
obtained. Then we pass the data through the trained vector quantizer and compute
the error vector between each input vector and the closest code vector. This forms
a new data set which is a collection of the first stage errors. This first stage error
data set is then applied to the vector quantizer on the second level until convergence
is obtained; then it is applied a final time to compute the second stage error vectors.
This continues until the last stage has been trained. Although the data subsets
produced by MSVQ do not have the same desirable structure as the data subsets
from TSVQ, there are far fewer codebooks for MSVQ to train. Indeed we find that
the smaller number of codebooks outweighs the longer convergence time in all cases
except for very small overall codebooks. Thus the MSVQ requires significantly fewer
training passes than TSVQ to reach convergence.
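
The training procedure above amounts to repeatedly replacing the data set by its
quantization errors. A minimal sketch follows, under the same caveats as before:
`train_vq` is a hypothetical interface standing in for the FSCL trainer, and
`quantile_trainer` is a deterministic stand-in used only to make the example
reproducible.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared error to x."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    return dists.index(min(dists))


def stage_residuals(data, codebook):
    """Error vectors between each input and its closest code vector."""
    out = []
    for x in data:
        c = codebook[nearest(x, codebook)]
        out.append(tuple(a - b for a, b in zip(x, c)))
    return out


def train_msvq(training_set, R, train_vq):
    """Train one codebook per stage; train_vq(data, size) is a
    caller-supplied, hypothetical trainer such as FSCL."""
    codebooks, data = [], list(training_set)
    for bits in R:
        codebook = train_vq(data, 2 ** bits)
        codebooks.append(codebook)
        data = stage_residuals(data, codebook)   # next stage sees the errors
    return codebooks


def quantile_trainer(data, size):
    """Deterministic stand-in for FSCL: pick evenly spaced sorted samples."""
    s = sorted(data)
    return [s[(2 * i + 1) * len(s) // (2 * size)] for i in range(size)]


toy_data = [(0.0,), (2.0,), (8.0,), (10.0,)]
toy_codebooks = train_msvq(toy_data, (1, 1), quantile_trainer)
print(toy_codebooks)
```

Only m codebooks are trained, one per stage, in contrast to the one codebook per
tree node required by TSVQ.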

It is useful at this point to examine the differences between MSVQ and TSVQ.
Both methods produce a multi-level process, but the processing at each level is
significantly different. The TSVQ algorithm presents the original data vector at each
level, while the MSVQ presents the residual error at each level. TSVQ has an ever
increasing number of vector quantizers at each level, while MSVQ has a single vector
quantizer at each level. TSVQ provides increasingly accurate estimates of the
input at each level by systematically dividing the higher dimensional vector space into
smaller and smaller subspaces into which the input must fall. MSVQ provides an
initial estimate at the first level, and provides a better estimate at each level by
continuing to add smaller and smaller correction terms in a way similar to the method of
successive approximations. Each of these corrections is a result of performing vector
quantization on the error subspace of the preceding level. In TSVQ, the code vectors
at intermediate levels are not actually utilized for reconstruction; they are only used
as pointers to direct the algorithm to the appropriate vector quantizer at the final
level. Only code vectors of the vector quantizers at the final level are actually used
in the image reconstruction.

The reconstructed images for MSVQ are presented in Figures 4.14-4.17. As in
the previous results, the simulations were conducted on a single test image of 256 x 256
pixels. This image was divided into blocks of various sizes chosen to yield a data rate
of 1 bit/pixel for each reconstruction using a variety of codebook sizes. The original
image is presented in Figure 4.3. MSVQ using a 2 x 2 block and a codebook size
of 16 is presented in Figure 4.14. This codebook was generated using a two level
architecture containing two codebooks of size 4. MSVQ using a 3 x 2 block and a
codebook of size 64 is presented in Figure 4.15. This codebook was generated using a
two level architecture containing two codebooks of size 8. MSVQ using a 3 x 3 block
and a codebook size of 512 is presented in Figure 4.16. This codebook was generated using
a three level architecture containing three codebooks of size 8. MSVQ using a 4 x 3
block and a codebook size of 4096 is presented in Figure 4.17. This codebook was
generated using a three level architecture containing three codebooks of size 16.

We also present one example of how the image develops through each stage of
the MSVQ process. Figures 4.18-4.20 show each stage for the example presented in
Figure 4.16. As we saw with TSVQ, the improvement in each stage is easy to see.
The property of graceful degradation is also manifested by MSVQ, since the figures
shown correspond to the lower bit rate images that would be produced if the channel
codewords were truncated.

As with FSVQ and TSVQ, we can see that the performance of MSVQ depends
strongly on the size of the codebook. The performance of MSVQ falls far short of the
standard set by FSVQ, as can be seen in the MSE comparison shown in Figure 4.21.
The reason for this large degree of suboptimality can be seen in the structure of
MSVQ. Consider a TSVQ structure in which we formed the data subsets for the next
level using the error vector instead of the original input vector. Since the vector
quantization process is translation invariant, the performance of this new structure
would be identical to the original TSVQ. We can also see that this structure is the
same as MSVQ except that a different codebook is used to encode the error vectors
for each branch of the tree. Thus MSVQ is equivalent to TSVQ if we assume that the
probability distribution function which describes the distribution of the errors about
each code vector on the same level of the tree is identical. That this assumption is far
from the truth accounts for the relatively poor performance of the MSVQ algorithm.

Figure 4.14 MSVQ Using a Size 16 Codebook and a 2x2 Block
Figure 4.15 MSVQ Using a Size 64 Codebook and a 3x2 Block
Figure 4.16 MSVQ Using a Size 512 Codebook and a 3x3 Block
Figure 4.17 MSVQ Using a Size 4096 Codebook and a 4x3 Block
Figure 4.18 MSVQ Using 3x3 Block, First Stage
Figure 4.19 MSVQ Using 3x3 Block, Second Stage
Figure 4.20 MSVQ Using 3x3 Block, Third Stage
Figure 4.21 Performance vs. Block Size

Although the performance of MSVQ is poor relative to FSVQ and TSVQ for
codebooks of the same size, MSVQ maintains several highly desirable features. We can
see from Figure 4.22 that MSVQ provides a huge computational advantage for large
codebooks. MSVQ also provides an extremely simple structure which would require
only a small number of processing elements and would make hardware implementation
much simpler. Finally, because MSVQ uses only one vector quantizer per level, the
algorithm vastly reduces the amount of storage required for simulation and decreases
the load on the transmission channel due to codebook transmission.


D. CLASSIFICATION VECTOR QUANTIZATION 

The refinements to the basic FSCL algorithm that we have examined so far
concentrate on reducing the computational cost of training the vector quantizer system.
Our standard for performance in all cases has been the mean square error. Now we
take a brief look at the subjective quality of the images produced. The most
noticeable problem with each of the methods is the staircase effect. This is where an edge
follows the outline of the blocks rather than the smooth edge of the original image,
as can be seen by examining the curve of the shoulder in Figures 4.3 and 4.5. This
staircase effect follows the size of the block used in coding the image, and will thus
become more and more noticeable as the block size is increased. This puts us in the
uncomfortable situation of wanting to increase the block size to improve mean square
error performance and at the same time wanting to reduce the block size to minimize
this staircase effect. In order to solve this dilemma we need to examine the cause of
this staircase effect and look at possible solutions.

One possible cause is that the codebook does not contain a sufficient variety
of code vectors which represent blocks with edges. To examine this possibility, a
codebook for an FSVQ with a 2 x 2 block size is presented in Table 4.6. The four pixel
values in each row constitute a code vector. We would expect a code vector which
represents an edge to contain both high and low values, but upon examining the
codebook in Table 4.6, we see that the code vectors exhibit almost no structure and
are certainly inadequate to represent all the possible edge configurations. To examine
the reason for this under-representation of edge blocks, we introduce an edge detector
which is used to indicate whether an edge appears somewhere in the block.


For each pair of adjacent pixels in the block, we take the pixel values, $m_1$ and
$m_2$, form the ratio

$$\frac{|m_1 - m_2|}{\max(m_1, m_2)}$$

and apply a threshold to determine if this is an
edge block or a shade block. The results of applying this ratio to our test image are
presented in Figure 4.23 for a block size of 2 x 2. The authors of [Ref. 16] chose
a threshold of 0.4 to define an edge block. Applying this value gives us only 202
edge blocks out of a total of 16384 blocks in the image. Thus it appears that the
problem with edges occurs because there are so few edge blocks in the image that
the poor representation of these blocks does not contribute significantly to the mean
square error. So the root of the problem seems to be that the distortion measure, i.e.,
mean square error, fails to take into account the perceptual importance of the edge
blocks. This leaves two basic solutions: change to a more complicated, perceptually
based distortion measure, or divide the problem by using separate vector quantizers
on the edge and shade blocks.

Figure 4.23 Histogram of Edge Detector Ratio Values

The technique of Classification Vector Quantization (CVQ) [Ref. 16] (see
Figure 4.13) uses the second method discussed above to improve the subjective quality
of the reconstructed image. The image is divided into blocks as before, but now we
apply the edge detector and use a threshold to separate the image into two data sets,
one containing the edge blocks and the other containing the shade blocks. These two
data sets are then applied separately to an FSCL vector quantizer which is trained
until convergence. The two resulting codebooks are then concatenated to form an
overall codebook which emphasizes the edge blocks. The amount of emphasis given
to the edge blocks depends on the sizes of the codebooks allocated to the edges and
shades. For example, a codebook size of 64 could be divided into 48 shade code
vectors and 16 edge code vectors. This would give the edges more emphasis than the
original technique. Even further emphasis would be obtained if we used 32 shade and
32 edge code vectors instead.
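
The classification step can be sketched as follows. This is an illustration under
stated assumptions rather than the detector of [Ref. 16] or the thesis software:
"adjacent" is taken to mean horizontal and vertical neighbours inside the block, a
block is called an edge block if any pair exceeds the threshold, the ratio is
defined as 0 when both pixels are 0, and all function names are ours.

```python
def edge_ratio(m1, m2):
    """|m1 - m2| / max(m1, m2); taken as 0 when both pixels are 0."""
    largest = max(m1, m2)
    return abs(m1 - m2) / largest if largest > 0 else 0.0


def is_edge_block(block, threshold=0.4):
    """block is a list of rows of non-negative pixel values."""
    rows, cols = len(block), len(block[0])
    for r in range(rows):
        for c in range(cols):
            # Compare with the right-hand and lower neighbours, if present.
            if c + 1 < cols and edge_ratio(block[r][c], block[r][c + 1]) > threshold:
                return True
            if r + 1 < rows and edge_ratio(block[r][c], block[r + 1][c]) > threshold:
                return True
    return False


def split_edge_shade(blocks, threshold=0.4):
    """Separate blocks into (edge blocks, shade blocks)."""
    edges = [b for b in blocks if is_edge_block(b, threshold)]
    shades = [b for b in blocks if not is_edge_block(b, threshold)]
    return edges, shades


flat = [[100, 100], [100, 102]]       # nearly uniform: a shade block
step = [[200, 50], [200, 50]]         # vertical step: an edge block
edges, shades = split_edge_shade([flat, step])
print(len(edges), len(shades))
```

The two lists would then be used to train the edge and shade codebooks
separately before concatenating them.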

The simulation results for CVQ are presented in Figures 4.24-4.26. As before,
a single test image of 256 x 256 pixels was used, and all test cases were conducted
at 1 bit/pixel. Figure 4.24 shows CVQ using a 2 x 2 block and a size 16 codebook
consisting of 8 edge and 8 shade code vectors. Figure 4.25 shows CVQ using a 3 x 2
block and a size 64 codebook consisting of 32 edge and 32 shade code vectors. Figure 4.26
shows CVQ using a 3 x 3 block and a size 512 codebook consisting of 384 edge and 128
shade code vectors. For the size 16 case (Figure 4.24) we can see that the codebook
is just too small to represent shades and edges well. The lack of enough shade code
vectors to cover the common grey levels is evident, and the few edge code vectors are
not enough to show much improvement over FSVQ. In the size 64 case (Figure 4.25)
we start to see some substantial improvement in the reproduction of the edges with
very little degradation in other areas of the image. Finally, for the size 512 case
(Figure 4.26), CVQ is substantially better in a subjective sense, and for the larger
codebooks and larger block sizes it is slightly better than FSVQ in the mean square
error sense (see Figure 4.27). It is surprising that any method could surpass the
performance of FSVQ since we believed this method to be optimal in a mean square
sense, but this effect probably stems from the fact that FSVQ converges very slowly,
and the test cases were not run for a sufficient number of training passes to reach the
final value.

Figure 4.24 CVQ Using a Size 16 Codebook and a 2x2 Block
Figure 4.25 CVQ Using a Size 64 Codebook and a 3x2 Block
Figure 4.26 CVQ Using a Size 512 Codebook and a 3x3 Block

A secondary benefit of applying the CVQ technique is an enormous computational
savings over FSVQ. This occurs because the code vectors for edge and shade
blocks appear to converge at different rates. The shade code vectors have a very simple
structure and therefore converge quickly, while the edge code vectors have a complex
structure and converge slowly. In FSVQ, we use a single codebook and thus all code
vectors are run through the data set the same number of times. So long after the
shade code vectors have converged, we continue to waste computational time updating
them. In CVQ, we avoid this problem, and we are then able to concentrate our
computational efforts on the difficult part of the problem. Also, as we have seen with
TSVQ, a data set which has a large amount of structure makes the FSCL algorithm
converge more quickly. The CVQ method accomplishes this by splitting the original
data set into shade and edge blocks, which further improves convergence speed. As
we can see in Figure 4.28, CVQ has a huge computational advantage over FSVQ as
well as better performance for large codebooks.

TABLE 4.3: Number of Codebooks Required
TABLE 4.4: Code Vector Storage Requirements
TABLE 4.5: Channel Load of Codebook Transmission (bits/pixel)
TABLE 4.6: Example Codebook


V. CONCLUSIONS


In this thesis we examined some existing algorithms to implement vector
quantization using neural networks. We also applied three techniques to improve
performance and reduce computational cost. In the previous chapter we presented each
technique separately. Here we will compare the relative performance of each of the
three algorithms. Since each algorithm has its strengths and weaknesses, we also
make suggestions about the likely situations where each of these techniques may be
appropriate.

First let us discuss image reproduction quality. It can be seen from Figure 4.27
that for a given block size, FSVQ, TSVQ, and CVQ all offer a similar level of
performance in a mean square sense, while MSVQ performs noticeably worse. To compare
performance in a subjective sense, we present the best results obtained for each
technique in the following figures. Figure 5.1 shows the FSVQ algorithm using a 3 x 3
block, Figure 5.2 shows TSVQ using a 3 x 3 block, Figure 5.3 shows MSVQ using
a 4 x 3 block, and Figure 5.4 shows CVQ using a 3 x 3 block. Here we see that
CVQ has a small advantage over FSVQ and TSVQ in a subjective sense, and MSVQ
is again noticeably worse.

Now we examine the issue of computational cost. We can see from Figure 4.28
that for a given block size, FSVQ has the highest computational cost, TSVQ is the
next highest, and CVQ and MSVQ have very similar and much smaller computational
costs. Perhaps a better way to rate the computational cost is to relate it to
performance. Figure 5.5 shows the relationship between cost and performance for each test
case performed. Algorithms that are most desirable are represented by points in the
lower left portion of the graph. We can see that the best combination of reproduction
quality and computational cost is given by the CVQ algorithm.

Figure 5.3 MSVQ
Figure 5.4 CVQ
Figure 5.5 Computational Cost vs. Performance (number of training passes vs. MSE; + = FSVQ, x = TSVQ, o = MSVQ, * = CVQ)

Although CVQ offers the best reproduction quality even when computational
cost is considered, the other two algorithms presented have advantages of their
own. Both MSVQ and TSVQ offer a large savings in the number of processing
elements required because of the linear structure each displays. Thus for a
hardware implementation these two techniques should be considered. We have also
seen that for large code book sizes the load on the channel due to code book
transmission becomes significant. So if our application requires an extremely
large code book, the MSVQ algorithm must be considered, as it is able to form a
large code book with very little load on the channel (see Table 4.5).
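
The code book argument for MSVQ can be illustrated with a short calculation. The sketch below is only an illustration: the function name msvq_codebook_sizes and the stage sizes are hypothetical and are not taken from Table 4.5. It relies on the general property that a multistage quantizer selects one codeword per stage, so the number of distinct reproduction vectors is the product of the stage sizes, while only the sum of the stage sizes has to be transmitted.

```python
def msvq_codebook_sizes(stage_sizes):
    """Return (effective, transmitted) codeword counts for a multistage VQ."""
    effective = 1
    transmitted = 0
    for n in stage_sizes:
        effective *= n      # one codeword is chosen per stage, so the choices multiply
        transmitted += n    # only the per-stage code books are sent over the channel
    return effective, transmitted


# Hypothetical example: two stages of 16 codewords each.
effective, transmitted = msvq_codebook_sizes([16, 16])
print(effective, transmitted)
```

Under these assumed sizes, two 16-word stages behave like a 256-word code book while only 32 codewords load the channel.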

This research has shown that neural networks can be very effective in the
implementation of vector quantization. With the application of algorithms such
as CVQ, TSVQ, and MSVQ, we can improve the performance of neural network vector
quantizers and make application of the vector quantization technique more
practical.


A. ADDITIONAL WORK 

Research is planned in the area of adaptive filters in an effort to improve the
convergence speed of the FSCL algorithm. In addition, we plan to investigate
other current vector quantization techniques and determine whether neural
network vector quantizers can be improved by their application. After these
steps are completed, we will attempt to combine several of the chosen techniques
in the hope of further improving overall performance. Finally, we plan to apply
the techniques in this thesis to the coding of speech data.
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APPENDIX A: PROGRAM DETAILS 


This appendix contains the program flowcharts and listings for each of the
algorithms in the thesis. Figures A.1-A.4 show the flowcharts, and the program
listings follow.
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Figure A.1 Basic FSCL Algorithm 
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Figure A.2 Tree Search Algorithm 
Figure A.3 Multi Stage Algorithm


Figure A.4 (flowchart; recoverable labels: image in row format, class.m,
imgcon1.m, edge data, shade data, subset data vector, image in subset format,
coded image, imgcon3.m, coded image in row format)
function x=imgcon1(y)
% program to convert an array of image data into vector format using
% blocks of arbitrary size
%
% input variable   y   = subject image in row format
% output variable  x   = subject image in vector format
%
% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of desired block input by user
%                  n1  = width of block input by user
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = temporary storage for desired block
%
% note: n1a is computed from N(1) and n2a from N(2), so a square image is assumed

N=size(y);                         % initialize dimensions of input image
n2=input('height of block ');      % get height of block from user
n1=input('width of block ');       % get width of block from user
n1a=floor(N(1)/n1);                % find # of blocks to process horiz.
n2a=floor(N(2)/n2);                % find # of blocks to process vertic.
x=zeros(n1*n2,n1a*n2a);            % initialize output
for i=1:n2a                        % main loop: move vertically in image
  i1=(i-1)*n2+1;                   % set vertical placekeeper
  for j=1:n1a                      % inner loop: move horizontally in image
    k=(i-1)*n1a+j;                 % track number of blocks processed
    j1=(j-1)*n1+1;                 % set horizontal placekeeper
    z=y(i1:i1+n2-1,j1:j1+n1-1);    % get desired block from image
    x(:,k)=z(:);                   % make conversion from block to vector
  end
end
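
The block-to-vector conversion performed by imgcon1 can also be sketched in Python. This is an illustration under stated assumptions and not part of the thesis programs: the name image_to_vectors is hypothetical, the image is a list of rows, and each vector is returned as a list rather than as a column of a matrix. The block is read column by column to match the ordering produced by z(:) in the listing above.

```python
def image_to_vectors(image, block_h, block_w):
    """Split a 2-D list into non-overlapping blocks and flatten each one."""
    rows, cols = len(image), len(image[0])
    vectors = []
    for top in range(0, rows - block_h + 1, block_h):        # move vertically
        for left in range(0, cols - block_w + 1, block_w):   # move horizontally
            # Read the block column by column, as MATLAB's z(:) does.
            vectors.append([image[top + r][left + c]
                            for c in range(block_w)
                            for r in range(block_h)])
    return vectors


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(image_to_vectors(image, 2, 2))
```

Rows and columns that do not fill a complete block are dropped, which corresponds to the floor operations in the listing.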
Sep 20 11:03 1991 cbinit2.m Page 1

function [w,u]=cbinit2(x)
% program to initialize the code book
%
% input variable   x  = data set in vector format
%
% output variables w  = initial code book
%                  u  = initialized frequency vector
%
% local variables  N  = desired number of code words
%                  Nx = number of data vectors
%
% this program initializes the codebook by randomly selecting data vectors from the
% subject data set. It also sets up the initial frequency vector for the codebook
% with all values initialized to 1.

N=input('number of code words ');

rand('uniform')
Nx=max(size(x));
for i=1:N
  w(:,i)=x(:,ceil(Nx*rand(1)));
end
u=ones(1,N);
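
As a rough illustration of the initialization above, the following Python sketch draws code words at random from the data set and sets every win count to 1. The name init_codebook and the optional rng argument are assumptions of this sketch. As in the listing, vectors are drawn with replacement, so the same data vector may be chosen more than once.

```python
import random


def init_codebook(data, n_codewords, rng=None):
    """Pick n_codewords data vectors at random and start all win counts at 1."""
    rng = rng or random.Random()
    codebook = [list(rng.choice(data)) for _ in range(n_codewords)]
    wins = [1] * n_codewords
    return codebook, wins


data = [[1, 2], [3, 4], [5, 6]]
codebook, wins = init_codebook(data, 4, random.Random(0))
print(len(codebook), wins)
```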


function [w,u]=fscl(x,w,u)
% program to implement frequency sensitive competitive learning
%
% input variables  x  = subject data set arranged into vectors of appropriate size
%                  w  = existing weight matrix
%                  u  = existing win frequency vector
%
% output variables w  = updated weight matrix
%                  u  = updated win frequency vector
%
% local variables  nx = size of data vectors
%                  Nx = number of data vectors   Caution: Nx must be > nx
%                  N  = vector containing the size and number of weight vectors in w
%                  y  = ones vector used to set up comparison of distances
%                  d  = vector which stores the distance for each code vector
%                  md = the minimum distance contained in d
%                  iw = the index of code vector with minimum distance
%                  ep = learning rate
%
% This program conducts a single pass through data set x using the FSCL algorithm. The
% weight matrix, w, and win frequency vector, u, are updated and passed back to the calling
% routine.

nx=min(size(x));                         % initialize size of data vector
Nx=max(size(x));                         % initialize number of data vectors
N=size(w);                               % initialize dimensions of weight matrix
y=ones(1,N(2));                          % initialize ones vector

for k=1:Nx                               % main loop: perform once for each data vector
  d=sum((x(:,k)*y-w).^2);                % calculate distance for each code vector
  d=d.*u;                                % apply fairness function to distance
  [md,iw]=min(d);                        % find minimum distance
  ep=0.01*exp(-u(iw)/10000);             % determine learning rate for nearest code vector
  w(:,iw)=w(:,iw)+ep*(x(:,k)-w(:,iw));   % update weight vector for nearest code vector
  u(iw)=u(iw)+1;                         % update number of wins for nearest code vector
end
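
The single FSCL pass in the listing above can be restated as a minimal Python sketch. The function name fscl_pass and the list-of-lists data layout are assumptions of this sketch. The fairness scaling (distance multiplied by the win count) and the learning rate 0.01*exp(-u/10000) are taken from the listing.

```python
import math


def fscl_pass(data, codebook, wins):
    """One FSCL pass; codebook and wins are updated in place and returned."""
    for x in data:
        # Squared distance to each code vector, scaled by that vector's win count.
        scaled = [wins[i] * sum((xi - wi) ** 2 for xi, wi in zip(x, w))
                  for i, w in enumerate(codebook)]
        iw = scaled.index(min(scaled))              # winning code vector
        ep = 0.01 * math.exp(-wins[iw] / 10000)     # learning rate decays with wins
        codebook[iw] = [wi + ep * (xi - wi) for xi, wi in zip(x, codebook[iw])]
        wins[iw] += 1
    return codebook, wins


codebook, wins = fscl_pass([[0.0, 0.0], [10.0, 10.0]],
                           [[1.0, 1.0], [9.0, 9.0]],
                           [1, 1])
print(codebook, wins)
```

Because the distance is multiplied by the win count, a code vector that wins often becomes progressively less likely to win again, which spreads the training data over the whole code book.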
Sep 20 11:04 1991 mse.m Page 1

function m=mse(x,w,m)
% program to measure mean square error of codebook
%
% input variables  x    = subject data set arranged into vectors of appropriate size
%                  w    = weight matrix of codebook to be measured
%                  m    = record of mse for previous versions of code book
%
% output variable  m    = updated record of mse measurements
%
% local variables  Nx   = number of data vectors   Caution: Nx must be > nx
%                  N    = vector containing the size and number of weight vectors in w
%                  y    = ones vector used to set up comparison of distances
%                  d    = vector which stores the distance for each code vector
%                  mse1 = accumulator for current mse
%
% this program makes a single pass through the data set in order to measure the mse.
% the mse is then appended to an existing vector, m, which has the mse record for
% each iteration of the codebook

N=size(w);                   % initialize size of weight matrix
Nx=max(size(x));             % initialize number of data vectors
mse1=0;                      % initialize mse accumulator
y=ones(1,N(2));              % initialize ones matrix

for k=1:Nx                   % main loop : execute once for each data vector
  d=sum((x(:,k)*y-w).^2);    % calculate distance for each weight vector
  mse1=mse1+min(d);          % increment mse accumulator
end
mse1=mse1/(Nx*N(1));         % normalize mse
m=[m,mse1];                  % append new mse value to previous record
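
The measurement made by mse.m amounts to averaging, over every pixel, the squared error between each data vector and its nearest code vector. A minimal Python sketch of that calculation follows. The name codebook_mse and the list-based layout are assumptions of this sketch.

```python
def codebook_mse(data, codebook):
    """Mean square error per pixel when each vector is coded by its nearest code vector."""
    total = 0.0
    for x in data:
        # Smallest squared distance from x to any code vector.
        total += min(sum((xi - wi) ** 2 for xi, wi in zip(x, w))
                     for w in codebook)
    return total / (len(data) * len(data[0]))   # normalize by vectors times vector size


print(codebook_mse([[0, 0], [2, 2]], [[0, 0], [1, 1]]))
```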
function [z,mse]=code(x,w)


function x=imgcon3(y)
% program to convert an image in vector format to an image in row
% format with an arbitrary vector size
%
% input variable   y   = subject image in vector format
%
% output variable  x   = subject image in row format
%
% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of block input by user
%                  n1  = width of block input by user
%                  n3  = size of desired output image
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = temporary storage for desired block

N=size(y);                            % initialize dimensions of input image
n2=input('height of block ');         % get height of block from user
n1=input('width of block ');          % get width of block from user
n3=input('size of output image ');    % get desired output image size
n1a=floor(n3/n1);                     % find # of blocks in horiz. direction
n2a=floor(n3/n2);                     % find # of blocks in vert. direction
x=zeros(n3,n3);                       % initialize output image
for i=1:n2a                           % main loop : move vertically
  i1=(i-1)*n2+1;                      % set vertical placekeeper
  for j=1:n1a                         % inner loop : move horizontally
    k=(i-1)*n1a+j;                    % update number of blocks processed
    j1=(j-1)*n1+1;                    % set horizontal placekeeper
    z=zeros(n2,n1);                   % initialize temporary storage
    for l=1:n1                        % loop to convert vector to block
      m=(l-1)*n2+1;                   % find section of vector to process
      z(:,l)=y(m:m+n2-1,k);           % get segment of vector
    end
    x(i1:i1+n2-1,j1:j1+n1-1)=z;       % put completed block into image
  end
end
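
The vector-to-image conversion performed by imgcon3 can be sketched in Python as follows. This is an illustration only: the name vectors_to_image is hypothetical, and the sketch assumes a square output image and vectors stored column by column within each block, as in the listing above.

```python
def vectors_to_image(vectors, block_h, block_w, size):
    """Rebuild a size x size image from block vectors stored column by column."""
    image = [[0] * size for _ in range(size)]
    blocks_per_row = size // block_w
    for k, vec in enumerate(vectors):
        top = (k // blocks_per_row) * block_h    # vertical placekeeper
        left = (k % blocks_per_row) * block_w    # horizontal placekeeper
        for c in range(block_w):
            for r in range(block_h):
                image[top + r][left + c] = vec[c * block_h + r]
    return image


vectors = [[1, 5, 2, 6], [3, 7, 4, 8], [9, 13, 10, 14], [11, 15, 12, 16]]
print(vectors_to_image(vectors, 2, 2, 4))
```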
% program to initialize the codebook for TSVQ
%
% variables  x1,x2,... = input data sets in vector form
%            w1,w2,... = initial code books
%            u1,u2,... = initialized frequency vectors
%            N         = desired number of code words
%            Nx        = number of data vectors in data set being processed

n=input('size of input vector ');            % get size of input vector
N=input('number of codewords ');             % get desired number of codewords
nb=input('number of branches in tree ');     % get number of branches in tree

% this program constructs the code book initialization for TSVQ by randomly
% choosing input data vectors from each data set

rand('uniform')                              % set up random number generator
for p=1:nb                                   % main loop : execute once for each branch of tree
  eval(['Nx=size(x',int2str(p),');']);       % get size of current data set
  Nx=Nx(2);
  for j=1:N                                  % inner loop : choose N random vectors from data set
    m=ceil(Nx*rand(1,1));                    % select random number
    eval(['w',int2str(p),'(:,j)=x',int2str(p),'(:,m);']);
                                             % place selected vector in appropriate code book
  end
  eval(['u',int2str(p),'=ones(1,N);']);      % initialize frequency counter
  eval(['m',int2str(p),'=[];']);             % initialize mse history
end
% program to sort vector for tree searched code
%
% variables  Nx        = number of data vectors   Caution: Nx must be > nx
%            N         = vector containing the size and number of weight vectors in w
%            y         = ones vector used to set up comparison of distances
%            d         = vector which stores the distance for each code vector
%            md        = minimum distance from data vector to a code word
%            iw        = index of minimum distance in d
%            x         = subject data set arranged into vectors of appropriate size
%            w         = weight matrix of codebook to be used for sorting
%            x1,x2,... = data sets of vectors for use in next stage
%            count     = vector to track size of output data sets
%
% this program performs the sorting of the input data set for use by the second
% level of t

N=size(w);                     % initialize dimensions of w
Nx=max(size(x));               % initialize number of input vectors
mse=0;                         % initialize mse
y=ones(1,N(2));                % initialize ones vector
for k=1:N(2)                   % this loop initializes the output data sets
  eval(['x',int2str(k),'=zeros(N(1),Nx/4);']);
  count(k)=0;                  % initialize size of output data sets
end
for k=1:Nx                     % main loop : execute once for each input vector
  d=sum((x(:,k)*y-w).^2);      % calculate distances for each code vector
  [md,iw]=min(d);              % find closest code vector
  count(iw)=count(iw)+1;       % update size of output data set chosen
  eval(['x',int2str(iw),'(:,count(iw))=x(:,k);']);   % update output data set
  if rem(k,1000)==0, k, end    % update progress to screen
end
for k=1:N(2)                   % this loop truncates the output data sets to eliminate
                               % the unused portion of the allocated space
  eval(['x',int2str(k),'=x',int2str(k),'(:,1:count(k));']);
end
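
The sorting step above partitions the training set according to the nearest first-level code vector, so that each branch of the tree can be trained on its own subset. A minimal Python sketch of that partitioning follows. The names nearest and partition_by_nearest are assumptions of this sketch, and ordinary lists replace the numbered variables x1, x2, ... that the listing builds with eval.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared distance to x."""
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in codebook]
    return dists.index(min(dists))


def partition_by_nearest(data, codebook):
    """Sort data vectors into one subset per first-level code vector."""
    subsets = [[] for _ in codebook]
    for x in data:
        subsets[nearest(x, codebook)].append(x)
    return subsets


data = [[0, 1], [9, 9], [1, 0], [10, 11]]
print(partition_by_nearest(data, [[0, 0], [10, 10]]))
```

Growing each subset as a list avoids the preallocation and truncation steps that the listing needs for fixed-size matrices.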
% program to code an image for tsvq
%
% variables  x         = data set with image in vector format
%            w         = weight matrix for vector quantizer at first level
%            w1,w2,... = weight matrices for vector quantizer at second level
%            wa        = weight matrix chosen for use at second level
%            z         = approximate image produced by coding in vector format
%            mse       = mean square error of approximation, z
%            N         = vector containing size and number of weight vectors in w
%            Nx        = number of data vectors
%            y         = ones vector used to set up comparison of distances
%            d         = vector which stores the distance for each code vector
%            d2        = vector distances for code book at second level
%            md        = the minimum distance contained in d
%            md2       = the minimum distance contained in d2
%            iw        = the index of the code vector with minimum distance
%            iw2       = index of closest code vector in level two
%
% this program performs coding for a two level TSVQ. The input and output
% images are both in vector format

N=size(w1);                            % initialize dimensions of w
Nx=max(size(x));                       % initialize number of input data vectors
mse=0;                                 % initialize mse
y=ones(1,N(2));                        % initialize ones vector
z=zeros(N(1),Nx);                      % initialize output image

for k=1:Nx                             % main loop : execute once for each input vector
  d=sum((x(:,k)*y-w).^2);              % find distances for code book at first level
  [md,iw]=min(d);                      % find closest code vector at first level
  eval(['wa=w',int2str(iw),';']);      % pick weight matrix to be used at level two
  d2=sum((x(:,k)*y-wa).^2);            % find distances for code book at level two
  [md2,iw2]=min(d2);                   % find closest code vector at level two
  z(:,k)=wa(:,iw2);                    % place approximation in output image
  mse=mse+sum((x(:,k)-z(:,k)).^2);     % increment mse
  if rem(k,1000)==0, k, end            % update progress to screen
end
mse=mse/(Nx*N(1));                     % normalize mse
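
The two-level search in the listing above can be summarized by the following Python sketch. The names nearest and tsvq_encode are assumptions of this sketch, and the second-level code books are held in a list indexed by the first-level winner instead of in the numbered variables w1, w2, ... used by the listing.

```python
def nearest(x, codebook):
    """Index of the code vector with the smallest squared distance to x."""
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in codebook]
    return dists.index(min(dists))


def tsvq_encode(x, root, children):
    """Two-level tree search: pick a branch, then search only that branch."""
    i = nearest(x, root)             # closest code vector at the first level
    j = nearest(x, children[i])      # closest code vector in the chosen branch
    return i, j, children[i][j]


root = [[0, 0], [10, 10]]
children = [[[-1, -1], [1, 1]], [[9, 9], [11, 11]]]
print(tsvq_encode([10.5, 10.5], root, children))
```

With N1 first-level and N2 second-level code vectors, each input is compared against N1 + N2 vectors rather than all N1 * N2 leaves, which is the source of the search savings.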
function z=mssort(x,w)
% program to set up multi stage vq
%
% input variables  x  = subject data set arranged into vectors of appropriate size
%                  w  = weight matrix of codebook to be used for sorting
%
% output variable  z  = data set of error vectors for use in next stage
%
% local variables  Nx = number of data vectors   Caution: Nx must be > nx
%                  N  = vector containing the size and number of weight vectors in w
%                  y  = ones vector used to set up comparison of distances
%                  d  = vector which stores the distance for each code vector
%                  md = minimum distance from data vector to a code word
%                  iw = index of minimum distance in d
%
% this program takes a data set and a code book and performs one pass through each
% data vector, finding the closest code vector and calculating and storing the
% error. This new data set is used for the next stage in Multi Stage Vector
% Quantization.

N=size(w);                     % initialize number and size of weight vectors
Nx=max(size(x));               % initialize number of data vectors
y=ones(1,N(2));                % initialize ones vector
z=zeros(N(1),Nx);              % initialize error data set

for k=1:Nx                     % main loop : execute once for each data vector
  d=sum((x(:,k)*y-w).^2);      % calculate distance for each code word
  [md,iw]=min(d);              % find the minimum distance
  z(:,k)=x(:,k)-w(:,iw);       % calculate and store the error vector
  if rem(k,1000)==0, k, end    % update progress every 1000 data vectors
end
function [z,mse]=mscode(x,w,w1)
% program to code an image for ms vq
%
% input variables  x   = data set with image in vector format
%                  w   = weight matrix for vector quantizer at first level
%                  w1  = weight matrix for vector quantizer at second level
%
% output variables z   = approximate image produced by coding in vector format
%                  mse = mean square error of approximation, z
%
% local variables  N   = vector containing size and number of weight vectors in w
%                  Nx  = number of data vectors
%                  x2  = data set containing first level error
%                  y   = ones vector used to set up comparison of distances
%                  d   = vector which stores the distance for each code vector
%                  d2  = vector distances for code book at second level
%                  md  = the minimum distance contained in d
%                  md2 = the minimum distance contained in d2
%                  iw  = the index of the code vector with minimum distance
%                  iw2 = index of closest code vector in level two
%
% this program performs coding for the MSVQ algorithm. This version is limited
% to a two level architecture. The input and output image are both in
% vector format.

N=size(w);                             % initialize dimensions of w
Nx=max(size(x));                       % initialize number of data vectors
mse=0;                                 % initialize mse
y=ones(1,N(2));                        % initialize ones vector
z=zeros(N(1),Nx);                      % initialize output image

for k=1:Nx                             % main loop : execute once for each data vector
  d=sum((x(:,k)*y-w).^2);              % find distances for first level code book
  [md,iw]=min(d);                      % find closest code vector on first level
  x2=x(:,k)-w(:,iw);                   % form first level error
  d2=sum((x2*y-w1).^2);                % find distances for second level code book
  [md2,iw2]=min(d2);                   % find closest code vector on second level
  z(:,k)=w(:,iw)+w1(:,iw2);            % form second approximation to input vector
  mse=mse+sum((x(:,k)-z(:,k)).^2);     % increment mse
  if rem(k,1000)==0, k, end            % update progress to screen
end
mse=mse/(Nx*N(1));                     % normalize mse



Sep 20 11:20 1991 class2.m Page 1

function [x1,x2]=class1(y)
% program to convert an array for use in classified vq
%
% input variable   y   = subject image in row format
% output variables x1  = vector format array of edge blocks
%                  x2  = vector format array of shade blocks
% local variables  N   = vector which stores the dimensions of y
%                  n2  = height of desired block input by user
%                  n1  = width of block input by user
%                  n1a = number of blocks to process in horizontal direction
%                  n2a = number of blocks to process in vertical direction
%                  k   = index to track number of blocks processed
%                  i1  = vertical placekeeper in subject image
%                  j1  = horizontal placekeeper in subject image
%                  z   = array used to evaluate edge detector ratio
%
% this program takes an image in row format, applies an edge detector, and
% outputs two data sets in vector format. The first data set consists of
% the edge blocks, and the second consists of the shade blocks.

N=size(y);
n2=input('height of block ');
n1=input('width of block ');
n1a=floor(N(1)/n1);
n2a=floor(N(2)/n2);
x1=zeros(n1*n2,n1a*n2a);
x2=x1;
count1=0;
count2=0;
for i=1:n2a
  i1=(i-1)*n2+1;
  for j=1:n1a
    k=(i-1)*n1a+j;
    j1=(j-1)*n1+1;
    z=y(i1:i1+n2-1,j1:j1+n1-1);
    z1=z(:);
    % relative differences between each pair of the first four pixels
    z2(1)=(z1(1)-z1(2))/max(z1(1),z1(2));
    z2(2)=(z1(1)-z1(3))/max(z1(1),z1(3));
    z2(3)=(z1(1)-z1(4))/max(z1(1),z1(4));
    z2(4)=(z1(2)-z1(3))/max(z1(2),z1(3));
    z2(5)=(z1(2)-z1(4))/max(z1(2),z1(4));
    z2(6)=(z1(3)-z1(4))/max(z1(3),z1(4));
    if max(abs(z2)) > 0.4
      % edge block
      count1=count1+1;
      x1(:,count1)=z1;
    else
      % shade block
      count2=count2+1;
      x2(:,count2)=z1;
    end
  end
end
x1=x1(:,1:count1);
x2=x2(:,1:count2);
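
The edge test used in the listing above can be restated as a short Python sketch. The name is_edge_block is hypothetical. The threshold of 0.4 and the six pairwise ratios over the first four pixels follow the listing. Treating a pair of zero-valued pixels as having no difference is an assumption added here to avoid a division by zero, and non-negative pixel values are assumed.

```python
from itertools import combinations


def is_edge_block(vec, threshold=0.4):
    """Classify a block as edge (True) or shade (False) from its first four pixels."""
    ratios = []
    for a, b in combinations(vec[:4], 2):   # pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
        biggest = max(a, b)
        # Assumption of this sketch: two zero pixels count as no difference.
        ratios.append((a - b) / biggest if biggest != 0 else 0.0)
    return max(abs(r) for r in ratios) > threshold


print(is_edge_block([100, 50, 100, 100]), is_edge_block([100, 70, 100, 100]))
```

A block whose pixels differ by more than 40 percent of the larger value in any pair is sent to the edge data set, and all other blocks go to the shade data set.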
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